[231] in linux-scsi channel archive


A few cleanups to 1.2.9, and a request for help.

daemon@ATHENA.MIT.EDU (John Newnham)
Tue Jun 6 08:39:28 1995

Date: Tue, 06 Jun 1995 21:26:52 +1000
From: John Newnham <jnewnham@broncho.ct.monash.edu.au>
To: linux-scsi@vger.rutgers.edu

[ Executive summary:  found some buglets, but still got a
  scsi timeout that is fixed by defining DEBUG in sd.c,
  seagate controller, XT-4380S hard disk, what can I do? ]

Hello *,

I have been trying to solve an argument between my
scsi host and a "new" hard drive, and in the process
I have found a few buglets remaining in the 1.2.9 source
tree.  I will send a patch to Eric or Drew if nobody here
tells me otherwise.  However, I have heard that there are
some pre-1.3 patches for the scsi system floating around.
Is this true?  If so, could some kind soul tell me where
to get them?  Then I can check to see if the following
are already fixed there, and also to see if they make my
hard drive any happier.  If not, *shrug* I will scream
for help :-)

The buglets are (I will not include a diff or line
numbers, 'cos my source tree is instrumented at the moment,
but really 'cos I am typing this from memory :-):

Two spelling bugs in scsi.c:  "problemes" rather than
"problems", and "to" instead of "too" on the same line;

A printk() enabled by DEBUG_DELAY appears to be missing an
argument;

The line which is #ifdef notyet (near TEXEL borken support)))
has too many close parentheses;

"on" should be "one" in the comment at the top of sd.c:rw_intr().

None of these affect functionality in any way, so they are
perfect candidates for fixing in the 1.2 release ;-).


Now for my problems.  I have a 486-DX33 ISA system with 16 MB
of RAM and a Future Domain ST-01 compatible controller, which
has driven my floptical disk well for over a year.  I did not have
to do the swap control/data trick to make it work, just turn
off appropriate ROM shadowing.  I have recently acquired a
Maxtor XT-4380S hard disk, and it detects okay _most_ of the
time.  Occasionally, it comes up with garbage for the
manuf./model/rev (what I would identify as line noise on a
serial link), and this problem may be slightly worse in 1.2
than it was in 1.1.  But I can live with it (though if it
misses the sector size, the system will panic, defeating my
rc-based kludges).

BUT, under load the drive times out, the kernel tries to abort,
that times out, then the kernel tries to do a reset and ends
up panicking.  With 1.2.4 this was very easy to do (often
just the daemons in rc.M would do it); under 1.2.9 it is harder
to do, but still predictable:  cat doom.wad >>/dev/null will
get it every time.

I have been trying to find for myself where to play with the
timeouts, but the problem is that the panic occurs in the
interrupt routine, and putting printk's in the interrupt routine
makes the panics less likely.  In fact, the easiest fix is
to define DEBUG in sd.c, anywhere before rw_intr() (but preferably
before the #include <sched.h>), and then to add
"kern.info;kern.notice /dev/null"
in /etc/syslog.conf - this results in a system which, if it
detects the drive reliably on bootup, is 100% reliable.
(This, and the fact that the floptical works, suggests to me
that it is indeed a software fault, not termination or cabling
or the drive itself).
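
For anyone unfamiliar with the idiom, here is a minimal user-space
sketch of what "define DEBUG before the code that tests it" does.
It is NOT the kernel source: DPRINTK, debug_lines and fake_rw_intr()
are hypothetical stand-ins invented for illustration, and the idea
that the extra output merely changes timing in the interrupt path
is a guess, not something that has been confirmed.

```c
#include <assert.h>
#include <stdio.h>

/* Must appear before any code that tests it with #ifdef. */
#define DEBUG

/* Hypothetical counter, only here so the effect is observable. */
static int debug_lines = 0;

#ifdef DEBUG
/* With DEBUG defined, every call prints a message (and costs time). */
#define DPRINTK(...) \
    do { debug_lines++; fprintf(stderr, __VA_ARGS__); } while (0)
#else
/* Without DEBUG, the calls compile away to nothing. */
#define DPRINTK(...) do { } while (0)
#endif

/* Hypothetical stand-in for sd.c:rw_intr(); not the real function. */
static int fake_rw_intr(int sectors)
{
    assert(sectors >= 0);
    DPRINTK("rw_intr: %d sectors done\n", sectors);
    return sectors;
}
```

Removing the #define DEBUG line makes the same call silent, which
is the only difference between the "reliable" and "panicking"
builds described above.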

The path of the crash is repeatable; the trace is always:
	scsi_done
	seagate_st0x_queue_command
	seagate_st0x_queue_command
	scsi_request_sense
	scsi_done						<-	hmm...
	scsi_reset
	scsi_times_out
	scan_scsis						<-	huh?
	scsi_main_timeout

with printks "command timed out", "abort timed out - resetting",
"Danger Will Robinson", "disk error", "I/O error",
"kernel NULL ptr dereference" ... register dump, not syncing.

If anybody can help me with this (or point me at some patches
that may fix it), I would much appreciate it.  I can supply
actual EIP values, stack contents etc. if needed.

ObThanks:  to Drew, Eric, Ted, H.J. Lu, and all you others
	(including, umm...  Linus!), thanks for a great system!


regards,

John Newnham
aka. ashtray

jnewnham@broncho.ct.monash.edu.au
jnewnham@ponderosa.is.monash.edu.au
ashtray@yoyo.cc.monash.edu.au
