[189] in linux-scsi channel archive
SCSI Reset Functions.
daemon@ATHENA.MIT.EDU (Eric Youngdale)
Sun May 21 16:51:21 1995
Date: Sun, 21 May 95 16:23 EDT
From: eric@aib.com (Eric Youngdale)
To: "Leonard N. Zubkoff" <lnz@dandelion.com>
Cc: linux-scsi@vger.rutgers.edu, drew@boulder.openware.com
>Agreed. Does a DAT drive return UNIT_ATTENTION after a bus reset as the
>CD-ROMs do?
In theory, yes. In practice, they might not. The problem is
that UNIT_ATTENTION also comes with a change in media. I would
suggest that a flag be set in the Scsi_Device structure indicating a
reset. This could be set by the low level reset function. The upper
levels of device drivers would decide how to handle this, and when to
clear it.
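
A minimal sketch of the flag idea, for illustration only. The structure and function names below are hypothetical stand-ins, not the real Scsi_Device definition or any existing kernel interface; only the sense-key values (0x02 NOT READY, 0x06 UNIT ATTENTION) come from the SCSI specification. The choice to clear the flag on the first UNIT_ATTENTION is just one possible upper-level policy, and it cannot rule out a media change that happens to coincide with the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Scsi_Device; only the proposed flag is shown. */
struct sketch_scsi_device {
    bool was_reset;   /* set by the low-level reset function */
};

/* Sense-key values as defined by the SCSI specification. */
enum { SENSE_NOT_READY = 0x02, SENSE_UNIT_ATTENTION = 0x06 };

/* What an upper-level driver might conclude from a sense key. */
enum ua_cause { UA_NONE, UA_RESET, UA_MEDIA_CHANGE_POSSIBLE };

/* Low-level driver: record that this device has just been reset. */
void sketch_note_reset(struct sketch_scsi_device *dev)
{
    assert(dev != NULL);
    dev->was_reset = true;
}

/* Upper-level driver: decide what a sense key most likely means.
 * Assumed policy: the first UNIT ATTENTION seen after a recorded reset
 * is attributed to the reset, and the flag is cleared at that point. */
enum ua_cause sketch_classify_sense(struct sketch_scsi_device *dev,
                                    int sense_key)
{
    assert(dev != NULL);
    if (sense_key != SENSE_UNIT_ATTENTION)
        return UA_NONE;
    if (dev->was_reset) {
        dev->was_reset = false;
        return UA_RESET;
    }
    return UA_MEDIA_CHANGE_POSSIBLE;
}
```

Because the flag lives in the device structure rather than in the sense data, an upper-level driver could also test it directly for devices that never report UNIT_ATTENTION after a reset.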
>which is one of the reasons I'm leaning in that direction. I'm much more
>concerned about stability and recovery of the system as a whole than of an
>individual device. My system hardware is also sufficiently error-free that I
>had to attack a sacrificial CD-ROM with a Sharpie to generate these errors.
Yes, I agree. Another concern I have had is with cache
coherency for caching controllers and for the cache on disk drives that
do write caching. If putting a CD-ROM that has been written on with a
Sharpie marker in the drive corrupts the hard drive (the same goes for
a tape with a bad spot on it), then I would rather not do the reset at
all than do it wrong. To put it another way, it needs to be carefully
thought out first.
> Most other devices should not be that bothered by a bus reset,
> except for removable disks where the door lock will be released.
>
>I take it they don't spin down then? It appears that CD-ROMs first return
>UNIT_ATTENTION and then NOT_READY, which I'm assuming is due to a spin-down
>upon reset.
They might spin down - I am not sure. They should spin up again
if you need them, so this is not really an issue.
> FWIW, this is the next thing I was planning to add better support
> for after the 1.3 series gets started.
>
>Sounds good. I'd definitely like to see a two phase reset sequence, the first
>being Bus Device Reset, and the second a full SCSI Bus reset.
This is sort of what I had in mind. Try the safer alternative
first, and if this fails then reach for the dynamite.
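
As a rough sketch of that escalation, assuming the low-level driver exposes two hooks that return 0 on success. The hook names, the ops structure, and the stub functions are all hypothetical and exist only so the logic can be tried without hardware; nothing here is an existing driver interface.

```c
#include <assert.h>
#include <stddef.h>

enum reset_result { RESET_BY_DEVICE, RESET_BY_BUS, RESET_FAILED };

/* Hypothetical hooks a low-level driver would supply; 0 means success.
 * A NULL hook means the board cannot perform that kind of reset. */
struct sketch_reset_ops {
    int (*bus_device_reset)(int target);
    int (*bus_reset)(void);
};

/* Phase one: Bus Device Reset aimed at the one misbehaving target.
 * Phase two: a full SCSI bus reset, only if phase one did not work. */
enum reset_result sketch_escalating_reset(const struct sketch_reset_ops *ops,
                                          int target)
{
    assert(ops != NULL);
    if (ops->bus_device_reset != NULL && ops->bus_device_reset(target) == 0)
        return RESET_BY_DEVICE;
    if (ops->bus_reset != NULL && ops->bus_reset() == 0)
        return RESET_BY_BUS;
    return RESET_FAILED;
}

/* Stand-in hooks that simulate hardware outcomes. */
int sketch_stub_target_ok(int target)   { (void)target; return 0; }
int sketch_stub_target_fail(int target) { (void)target; return -1; }
int sketch_stub_bus_ok(void)            { return 0; }
```

With the failing stub installed as the first hook, the function falls through to the bus reset; with the succeeding stub, the bus reset is never attempted.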
> by writing a special string to this "file". My concern is that a bus
> reset is a pretty severe thing to do, and it is something to be avoided if
> at all possible, and if any other course of action is available.
>
>That does sound dangerous. I'm afraid that in the cases where I can see using
>that feature, it's quite likely the SCSI bus would already be too wedged to run
>the special command anyway.
Yeah, I know. The problem has always been deciding when the SCSI
system is so badly wedged that we need to do a reset, and never performing
the reset when it is not required.
>On a more positive note, I've turned on Tag Queuing in my new driver and it
>seems to be working just fine in my testing so far.
In theory you should get better performance with this. Have you observed any improvement :-)?
Have you tried getting a command abort to work? On the 1542 this just
made things worse, so it is a no-op in the 1542 driver.
Before we start coding, we should probably work out the exact
details of how we handle a few things:
* Who decides whether we should perform a bus reset or a bus device
reset, and under what conditions should we perform the bus reset?
For this, I was thinking that we could check to see whether all of
the other active commands on the bus have timed out - if so then
we could consider it safe to perform the bus reset.
* If we send a bus reset, who sends the command completion notification
to the other outstanding commands on the bus? scsi.c, or the
low level driver?
How do we handle boards that keep track of the outstanding commands
and automatically restart them? How about boards that automatically
report the commands failed with a DID_RESET message? How about
boards that completely flush all memory of the outstanding commands?
The driver itself must be able to account for these differences,
so the author of the driver must know what the board is going to
do.
* For removable media, should we attempt to relock the door?
Should we pass a message up indicating that a door lock is required?
This requires that another scsi command be sent, so we need
to be careful here. Currently if we get a message from the
device indicating a reset has taken place, we retry the command.
* Tape drives. I suggest that all access should fail after a reset
until the user rewinds the tape, or we get a media change. This
could be tricky, because a UNIT_ATTENTION is also used to indicate
a media change, so it might be difficult to tell whether the
tape position has actually been lost. Do we want to force the
user to rewind in all cases? Can we query the device to find
out if it is at BOT, and if so clear the flag, otherwise fail the
command?
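
To make the first question in the list above concrete, here is a small sketch of the "have all the other active commands timed out" test. The command record and function name are hypothetical; a real implementation would walk whatever per-host command list scsi.c keeps, which is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-command record for one host adapter's bus. */
struct sketch_cmd {
    bool active;      /* command has been issued and not yet completed */
    bool timed_out;   /* command's timeout has expired */
};

/* Returns true when every active command, other than the one at index
 * 'failing' that triggered recovery, has also timed out. Under the rule
 * proposed above, only then is a full bus reset considered safe. */
bool sketch_bus_reset_is_safe(const struct sketch_cmd *cmds, size_t n,
                              size_t failing)
{
    assert(cmds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == failing || !cmds[i].active)
            continue;             /* skip the trigger and idle slots */
        if (!cmds[i].timed_out)
            return false;         /* a healthy command would be destroyed */
    }
    return true;
}
```

If this returns false, the less drastic bus device reset would be the only option tried, since a bus reset would kill commands that are still making progress.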
-Eric