[689] in linux-scsi channel archive


Re: Serious problem with SCSI error handling

daemon@ATHENA.MIT.EDU (Drew Eckhardt)
Fri Oct 13 00:05:52 1995

To: Jurgen Botz <jbotz@orixa.mtholyoke.edu>
cc: Dave Andersen <angio@aros.net>, ncr53c810@mroe.cs.colorado.edu,
        linux-scsi@vger.rutgers.edu
In-reply-to: Your message of "Wed, 11 Oct 1995 12:32:55 EDT."
             <199510111633.MAA20527@orixa.mtholyoke.edu> 
Date: Thu, 12 Oct 1995 19:30:24 -0600
From: Drew Eckhardt <drew@poohsticks.org>

In message <199510111633.MAA20527@orixa.mtholyoke.edu>,
jbotz@orixa.mtholyoke.edu writes:
>
>Conclusion: the SCSI code has a bug (or bugs) that throw things into
>a bad state on certain disk errors that it *should* be able to recover
>from.  This does not appear to be in the NCR driver, but more likely
>in the higher level SCSI disk code, since Dave saw the same problem
>with a different controller.  The problem seems to exist in kernel
>versions at least 1.2.x through 1.3.32.

Or in the buffer cache code.  If an error is returned for one block in 
a request (up to 128 blocks with the current NCR driver configuration),
attempts to access any of the blocks that were in the same request 
sometimes return errors too.
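
To make the distinction concrete, here is a minimal sketch in C.  It is not
taken from the kernel: the names blk_status, fail_whole_request and
fail_single_block are made up for illustration, and the only figure borrowed
from above is the 128-block request size.  It contrasts the behaviour that
seems to be happening (one bad block fails every block in the request) with
the behaviour one would want (only the bad block fails).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 128  /* request size quoted above for the NCR driver */

/* Hypothetical per-block completion status; not a kernel structure. */
enum blk_status { BLK_OK = 0, BLK_ERROR = 1 };

/*
 * Suspected faulty behaviour: an error on block `bad` marks every
 * block in the request as failed.  Returns the number of failed blocks.
 */
static size_t fail_whole_request(enum blk_status st[], size_t nblocks,
                                 size_t bad)
{
    assert(nblocks <= MAX_BLOCKS && bad < nblocks);
    for (size_t i = 0; i < nblocks; i++)
        st[i] = BLK_ERROR;
    return nblocks;
}

/*
 * Desired behaviour: only the block that actually reported the error
 * is marked as failed; the rest of the request completes normally.
 */
static size_t fail_single_block(enum blk_status st[], size_t nblocks,
                                size_t bad)
{
    assert(nblocks <= MAX_BLOCKS && bad < nblocks);
    for (size_t i = 0; i < nblocks; i++)
        st[i] = (i == bad) ? BLK_ERROR : BLK_OK;
    return 1;
}
```

With a full 128-block request and one bad block, the first function reports
128 failed blocks and the second reports 1.  Whether the real problem sits
in the disk code or in the buffer cache is exactly what is still unknown.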

>I'm willing to put a fair amount of time into tracking-down/debugging
>this, but having no experience with SCSI driver code I doubt I could get
>very far by myself.  Furthermore I don't know if this is a simple bug or
>a major design flaw.  If any of the SCSI experts out there would like
>to work with me I would be delighted.

I've got a business trip through Wednesday.  After that, I have a couple of 
Syquest cartridges with bad blocks on them that I use for testing purposes, 
and I could look into it if either of the SQ555s will spin up.

