[199] in linux-scsi channel archive
SCSI Reset Functions.
daemon@ATHENA.MIT.EDU (Leonard N. Zubkoff)
Tue May 23 03:03:05 1995
Date: Mon, 22 May 1995 23:15:15 -0700
From: "Leonard N. Zubkoff" <lnz@dandelion.com>
To: eric@aib.com
Cc: linux-scsi@vger.rutgers.edu, drew@boulder.openware.com
In-Reply-To: "Eric Youngdale"'s message of Mon, 22 May 1995 10:27:45 -0400 <9505221027.ZM25047@aib.com>
From: "Eric Youngdale" <eric@aib.com>
Date: Mon, 22 May 1995 10:27:45 -0400
For the moment, I've changed my current implementation to try a Bus Device
Reset, and if there are two Bus Device Resets without a successful SCSI
command, then the next one causes a full Host Adapter and SCSI Bus Reset. I
think this is a reasonable way to go until the higher levels implement multiple
forms of reset.
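A minimal sketch of the escalation policy described above, for illustration only. The names (reset_kind, target_reset_state, choose_reset) are hypothetical and not taken from any driver. Clearing the counter after the hard reset is an assumption the text does not specify.

```c
/* Hypothetical names; a sketch of the policy, not actual driver code. */
enum reset_kind { RESET_BUS_DEVICE, RESET_HARD };

struct target_reset_state {
    /* Bus Device Resets issued since the last successful SCSI command. */
    int device_resets_since_success;
};

/* Call whenever a SCSI command to this target completes successfully. */
static void note_command_success(struct target_reset_state *s)
{
    s->device_resets_since_success = 0;
}

/*
 * Pick the reset to issue.  The first two requests without an intervening
 * successful command get a Bus Device Reset; the next one escalates to a
 * full Host Adapter and SCSI Bus Reset.
 */
static enum reset_kind choose_reset(struct target_reset_state *s)
{
    if (s->device_resets_since_success >= 2) {
        /* Assumption: start counting afresh after the hard reset. */
        s->device_resets_since_success = 0;
        return RESET_HARD;
    }
    s->device_resets_since_success++;
    return RESET_BUS_DEVICE;
}
```

A driver would call note_command_success() from its completion path and choose_reset() from its reset entry point.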
When I first started thinking about this, I had no knowledge about
what caching controllers really do. From what I gather, the idea is that
writes go to the controller's cache and are written out as soon as
possible. The same goes for the cache on the drive. I understand that DPT
controllers with a cache do not flush the cache when a hard reset is
requested. This is something that each driver author needs to keep in mind.
We need to investigate this for the AMI FastDisk caching BusLogic clone, and I
have an alpha site lined up for testing this. Hopefully, the card will not do
a complete reset without writing back any dirty data.
The basic idea that I had was that, to take advantage of tagged queueing, you
merely increase the number of outstanding commands per lun that the driver
supports. The upper level code just keeps feeding commands to the board, and
the board itself is responsible for tagging the actual commands. For DPT
boards this strategy seems to work, since Michael said that he had to
increase the number of commands per lun to 64 to get optimal performance.
You have to be aware of memory usage problems in this case - if the board
emulates a 1542, you would not want the count that high if you had more
than 16 MB of memory, for example.
I did some testing of this last night and found 3 commands per lun led to the
best performance, which makes me highly suspicious that there is a problem with
what the card is being asked to do. It just doesn't make sense that such a
small queue would be optimal. I need to look at the precise commands being
requested and see what's happening.
Yeah, I know. The details have always been a bit fuzzy, and we
have really only defined things once we start using a particular part of the
interface.
I think I've now figured out part of my confusion. I took the following from
scsi.h:
/* This means that we were able to reset the bus. We have restarted all of
the commands that should be restarted, and we should be able to continue
on normally from here. We do not expect any interrupts that will return
DID_RESET to any of the other commands in the host_queue, and the mid-level
code does not need to do anything special to keep the commands alive. */
#define SCSI_RESET_SUCCESS 2
to mean that I was responsible for restarting all the outstanding commands at
the driver level (i.e. sending them to the card again without informing the
high level). From examining other drivers like the NCR 53c810 one, I found
that most drivers set the result to DID_RESET and then called the completion
routine. It hit me today that perhaps that was the meaning of "restarting all
the commands". In retrospect, this makes sense as the higher levels need to
reset timeouts etc., but it wasn't apparent when I first read this.
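A rough sketch of that interpretation, assuming "restarting" means handing each outstanding command back to the higher level with a DID_RESET result. The mock_cmd structure and the function names are hypothetical stand-ins, not the real command structure. The DID_RESET value and its placement in the host byte of the result are assumptions to be checked against scsi.h.

```c
#include <stddef.h>

#define DID_RESET 0x08  /* assumed value; the real definition is in scsi.h */

/* Hypothetical stand-in for the driver's view of an outstanding command. */
struct mock_cmd {
    int result;                         /* host status in bits 16..23 */
    void (*done)(struct mock_cmd *);    /* completion routine */
    struct mock_cmd *next;              /* next outstanding command */
};

/* Example completion routine that just counts how often it is called. */
static int completed_count;

static void mock_done(struct mock_cmd *cmd)
{
    (void)cmd;
    completed_count++;
}

/*
 * After a reset, mark every outstanding command DID_RESET and call its
 * completion routine, leaving the higher level to reissue it and reset
 * its timeout.  Returns the number of commands completed this way.
 */
static int complete_outstanding_after_reset(struct mock_cmd *head)
{
    int n = 0;

    while (head != NULL) {
        /* Save the link first: done() may requeue or reuse the command. */
        struct mock_cmd *next = head->next;

        head->result = DID_RESET << 16;
        head->done(head);
        head = next;
        n++;
    }
    return n;
}
```

The driver itself does not resend anything to the card here; it only reports the reset and lets the caller decide what to do with each command.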
This is sort of what some of the SCSI_RESET_* options are for. For
example, SCSI_RESET_SUCCESS indicates that all of the outstanding commands
for the bus have automatically been restarted, so that we do not need to
do anything. SCSI_RESET_PENDING indicates that the outstanding commands
have not been restarted, and the mid level code should restart all commands
that were on the bus. Even then, I think the meaning of the various
options needs to be spelled out a bit more so that each one has a very
precise meaning.
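To make the distinction concrete, here is a small sketch of how mid-level code might act on the two return values as described above. The midlevel_action enum and action_for_reset_result() are hypothetical. The value of SCSI_RESET_PENDING is an assumption, since only SCSI_RESET_SUCCESS appears in the excerpt, and the other SCSI_RESET_* codes are left out.

```c
#define SCSI_RESET_SUCCESS 2  /* from the scsi.h excerpt */
#define SCSI_RESET_PENDING 3  /* assumed value; check scsi.h */

/* Hypothetical summary of what the mid-level code does next. */
enum midlevel_action {
    MID_DO_NOTHING,        /* driver already restarted the commands */
    MID_RESTART_COMMANDS,  /* mid level must restart commands on the bus */
    MID_UNHANDLED          /* some other SCSI_RESET_* code, not covered */
};

/* Map a low-level reset return code to the mid-level follow-up action. */
static enum midlevel_action action_for_reset_result(int code)
{
    switch (code) {
    case SCSI_RESET_SUCCESS:
        return MID_DO_NOTHING;
    case SCSI_RESET_PENDING:
        return MID_RESTART_COMMANDS;
    default:
        return MID_UNHANDLED;
    }
}
```

This encodes only the reading given in the paragraph above, and whatever "restarted" ends up meaning for SCSI_RESET_SUCCESS would need to be reflected in the first case.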
So what does "automatically been restarted" mean here? Is my conclusion above
correct, that setting DID_RESET and calling the completion routine is the way
one restarts them? I think much of my confusion comes from what the
higher levels will do with a given DID_XXXXX result code.
Leonard