[3028] in linux-scsi channel archive
RE: RAID & unhappy scsi driver
daemon@ATHENA.MIT.EDU (Doug Ledford)
Mon Jan 5 04:40:57 1998
In-Reply-To: <34B0AD32.1B21E4AC@fc.net>
Date: Mon, 05 Jan 1998 03:16:30 -0600 (CST)
From: Doug Ledford <dledford@dialnet.net>
To: Linas Vepstas <linas@fc.net>
Cc: linux-eata@trudi.zdv.Uni-Mainz.DE, linux-scsi@vger.rutgers.edu,
linux-raid@vger.rutgers.edu
On 05-Jan-98 Linas Vepstas wrote:
>I am disappointed to point out the following kernel "bug":
>
>Recently set up RAID w/ several seagates & adaptec 2940 on 2.0.33
>kernel.
>After a few weeks, one of the drives failed.
>
>I was unhappy to find the machine all-but locked up as a result,
>un pingable, un telnetable, etc. (although the keyboard did wake
>up the sleeping monitor.) Apparently the aic7xxx driver entered
>into some sort of infinite loop attempting to reset the scsi disk.
Not likely. More likely the mid level SCSI code sent the same commands back
time after time and they timed out, which resulted in the SCSI code calling
the aic7xxx_reset() routine repeatedly.
>I was unable to reboot until I went into bios and disabled the disk.
This is definitely a sign of something to do with the disk and not the
driver or controller.
>This kind of driver behaviour completely negates the point of
>hot-plug drive bays, severely impacts high-availability, and puts
>a big dent in the philosophy of RAID.
The only RAID solutions intended to address issues such as this are RAID 1
and 5. What RAID are you running?
>Anyone experience anything similar? Anyone working to improve
>the driver?
I have one other person experiencing something similar, but their problem
turned out to be a bum firmware on their Fujitsu drive.
>
>--------
>It occurs to me that some RAID setups might use one controller per disk,
>to avoid outage due to controller failure. But are the scsi device
>drivers robust enough to not hang/panic the kernel if a controller fails
>to respond?
It's very likely that the controller and driver were fine. More likely is
that the particular error mode the Seagate drive went into resulted in
complete SCSI bus wedges. Keep in mind that the SCSI bus has a shared BUSY
signal pin. Any device can make that pin active. If it makes that pin
active and never releases it, then nothing, and I mean *nothing*, will take
place on that bus until the drive is removed. Now, it very well may be a
case of something like the drive will negotiate just fine and respond to the
normal inquiry commands at bootup, but the first time you try to use it (or
the first time you access a particular media location) the drive could end
up going to Kansas. In that case, there is very little the low-level driver
can do. In the event that commands don't complete, all the driver knows is
that they didn't complete. We can't point a finger at any particular device
if the bus is wedged because *any* device can wedge the bus, and they can do
so at any time regardless of what device we might be attempting to
communicate with. In that case, we simply pass all of the commands back to
the mid level SCSI code as having been reset after the mid level code calls
our reset routine. On the other hand, if a device starts having problems
without completely losing its sanity, then it will return things such as
media errors to the controller. We then pass those media errors back up to
the mid level code and the SCSI sub system says "Hey, this drive has a
problem" and handles it accordingly.
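To make the wedged-bus point concrete, here is a toy model of the shared BUSY line. It is only a sketch: struct bus_model, drive_bsy(), release_bsy() and try_select() are names made up for illustration and have nothing to do with the real aic7xxx code. The host side can test whether the line is active, but it has no way to ask who is holding it.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TARGETS 16

/* Hypothetical model: one bit per SCSI ID currently driving BSY.
 * The real signal is a single wired-OR line; the per-ID bits exist
 * only inside this simulation and are invisible to the "host". */
struct bus_model {
    unsigned int bsy_drivers;
};

/* A device starts holding the shared BSY line active. */
void drive_bsy(struct bus_model *bus, int id)
{
    assert(id >= 0 && id < MAX_TARGETS);
    bus->bsy_drivers |= 1u << id;
}

/* A device lets go of BSY. */
void release_bsy(struct bus_model *bus, int id)
{
    assert(id >= 0 && id < MAX_TARGETS);
    bus->bsy_drivers &= ~(1u << id);
}

/* All the host can observe: is the line active or not. */
bool bus_is_busy(const struct bus_model *bus)
{
    return bus->bsy_drivers != 0;
}

/* Starting a command for any target needs a free bus, so the
 * outcome does not depend on which target we wanted to talk to. */
bool try_select(const struct bus_model *bus, int target)
{
    (void)target;
    return !bus_is_busy(bus);
}
```

If target 5 calls drive_bsy() and never calls release_bsy(), then try_select() fails for every target, not just for target 5, and nothing in the return value says which device is responsible.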
In any case, one thing I'm working on right now is implementing a new SCSI
sequencer in the aic7xxx driver, and it would be fairly easy for me to make
one change. That is, simply, modify the return code on commands based upon
the following:
Upon any successful command completion with a drive we set a flag
DEVICE_SUCCESS in our internal device array. To handle cases of multiple
resets, I can modify the return code we pass with each command. Currently,
all commands sent back due to a SCSI bus reset are sent back with the result
code of

    cmd->result = (DID_RESET | SUGGEST_RETRY << 8) << 16;

I can change that so it looks like this:

    if (p->devices[TARGET_INDEX(scb->cmd)].flags & DEVICE_SUCCESS)
        cmd->result = ((DID_RESET | SUGGEST_RETRY << 8) << 16) |
                      scb->hscb->target_status;
    else
        cmd->result = (DID_RESET << 16) | scb->hscb->target_status;

But the simple fact of life is that, with this type of failure, if the SCSI
subsystem doesn't quit sending commands to that drive at some point, it's
going to make the machine unusable forever. At some point, the RAID code
would have to take the target drive off line. What's worse, if the bus
reset didn't shake the flaky drive loose so that at least the other drives
could work again, then it could render every drive on that bus dead.
Additionally, even if the drive does shake loose on a reset, the code
snippet above could result in all drives going dead if every time the mid
level SCSI code sends commands back to us to be re-tried it always sends
commands for the flaky drive first and other drives later. In that case, we
would send our first command to the flaky drive, it would re-wedge the SCSI
bus, and because it was the first command sent, the other drives would never
have a chance to complete a command and set their own DEVICE_SUCCESS flags
again, which means there would be no way to differentiate between the flaky
drive and the others on the SCSI bus and they would *all* get marked as bad.
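The ordering problem can be played out in a small simulation. This is a sketch under several assumptions, not driver code: struct sim, reset_result() and run_round() are hypothetical, the DID_RESET and SUGGEST_RETRY values are defined locally and assumed to match the kernel's scsi.h, the DEVICE_SUCCESS bit value is invented, the target status byte is left out, and the flags are assumed to be cleared each time a reset flushes the outstanding commands.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's scsi.h; defined here so the sketch
 * compiles on its own. */
#define DID_RESET      0x08
#define SUGGEST_RETRY  0x10

#define DEVICE_SUCCESS 0x01   /* hypothetical bit value */
#define NUM_TARGETS    4

struct sim {
    unsigned int flags[NUM_TARGETS]; /* stand-in for p->devices[].flags */
    int flaky;                       /* target that wedges the bus when used */
};

/* Result code for a command flushed by a bus reset (status byte omitted). */
unsigned int reset_result(const struct sim *s, int target)
{
    assert(target >= 0 && target < NUM_TARGETS);
    if (s->flags[target] & DEVICE_SUCCESS)
        return (unsigned int)(DID_RESET | (SUGGEST_RETRY << 8)) << 16;
    return (unsigned int)DID_RESET << 16;
}

/* Issue one command per entry in order[] until the flaky target wedges
 * the bus, then perform one bus reset. Returns how many targets would
 * still be sent back with SUGGEST_RETRY. Assumes the flaky target
 * appears somewhere in order[]. */
int run_round(struct sim *s, const int *order, int n)
{
    int i, t, retryable = 0;

    for (i = 0; i < n; i++) {
        if (order[i] == s->flaky)
            break;                            /* bus wedges here */
        s->flags[order[i]] |= DEVICE_SUCCESS; /* command completed */
    }

    for (t = 0; t < NUM_TARGETS; t++) {
        if (reset_result(s, t) & ((unsigned int)SUGGEST_RETRY << 24))
            retryable++;
        s->flags[t] = 0;  /* assumption: the reset clears the flag */
    }
    return retryable;
}
```

With target 0 as the flaky one and every flag initially set, issuing in the order 0, 1, 2, 3 gives four retryable targets on the first reset and none on the second, so all four look equally dead. Issuing in the order 1, 2, 3, 0 gives three retryable targets on every round, and only the flaky drive gets the plain DID_RESET result.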
----------------------------------
E-Mail: Doug Ledford <dledford@dialnet.net>
Date: 05-Jan-98
Time: 03:16:31
----------------------------------