[1724] in linux-scsi channel archive
Disk gets trashed regularly w/Adaptec 2940!
daemon@ATHENA.MIT.EDU (J.D. Baldwin)
Fri Apr 18 16:19:18 1997
Date: Fri, 18 Apr 1997 16:09:24 -0400
From: baldwin@netcom.com (J.D. Baldwin)
To: linux-scsi@vger.rutgers.edu
Cc: dario@milano.europe.dg.com
I posted a query about this problem a while back and didn't get much
feedback--just one or two people suggesting I change adapters. I
decided to wait for a new kernel release and try it out before taking
that step. So, after a few days' worth of running 2.0.30, here I am,
asking the same question (in somewhat different groups) in hopes of a
more informative response.
I run 2.0.30 on an ALR Revolution MP II (PCI bus) with an Adaptec
2940 SCSI card controlling a 4 GB Seagate drive whose model number
I've forgotten. (The same thing happened with a Quantum disk, so I
think my problem is in the adapter support.)
I have experienced identical problems with *four* different Adaptec
cards, so I don't think I have a bad card.
The problem: when I boot the machine, I get lots of filesystem
errors during e2fsck. A sample from /var/log/syslog is attached
below.
Then the machine runs fine for 12-24 hours or so, and suddenly this
message is written to /var/log/syslog:
Apr 18 07:40:13 foo-host kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 2
At this point, the root filesystem is mounted read-only. I can log
on as root and execute `/bin/mount -n -o remount /` and the
filesystem is OK. Nothing else is ever written to syslog or messages,
but I assume that's just because syslogd gives up writing after its
first failure.
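For anyone trying to read the "return code" in that message, here is a
minimal sketch. It assumes the Linux SCSI convention of packing four
bytes into the result word (status in the low byte, then message, host
and driver bytes). The function name and the short status table are
illustrative only and are not taken from the kernel source.

```python
# Illustrative decoder for a Linux SCSI result word ("return code" in the log).
# Assumption: the word packs four bytes, from low to high:
#   status byte | message byte | host byte | driver byte
# The table lists a few raw SCSI-2 status codes; it is not exhaustive.
SCSI_STATUS_NAMES = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
}


def decode_scsi_result(result):
    """Split a 32-bit result word into its four one-byte fields."""
    status = result & 0xFF
    return {
        "status": status,
        "status_name": SCSI_STATUS_NAMES.get(status, "unknown"),
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    # The value logged above: "return code = 2"
    print(decode_scsi_result(2))
```

Under that assumption, a return code of 2 is a CHECK CONDITION status
from the target, with the host and driver bytes both zero. In other
words, the drive itself reported an error, and the sense data (not
shown in the log) would say which one.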
That's not all. Here and there, a few inodes seem to get "shuffled."
Contents of one file get trashed and overwritten with the contents
of another file. Attempts to delete files in a certain directory
cause an immediate crash of the entire system. Weird stuff like
that.
As I say, this has happened with this machine with two different
makes of HD and four separate Adaptec SCSI cards.
So, what I'm looking for is:
1) a fix or workaround, or at least an indication that someone,
somewhere is aware of this problem and working on a fix
2) a recommendation as to another model of SCSI-2 adapter. I hear
NCR support in Linux is pretty good. Any specific model numbers
in mind?
Thanks in advance for any help. Here are a few sample errors from
/var/log/messages as logged on boot-up:
Apr 17 14:21:32 foo-host kernel: EXT2-fs error (device 08:01):
ext2_check_inodes_bitmap: Wrong free inodes count in group 485,
stored = 8133, counted = 8139
Apr 17 14:21:32 foo-host kernel: EXT2-fs error (device 08:01):
ext2_check_inodes_bitmap: Wrong free inodes count in super block,
stored = 3998702, counted = 3999923
Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
ext2_readdir: bad entry in directory #687459: rec_len is too small
for name_len - offset=0, inode=1952671082, rec_len=25632,
name_len=29541
Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
ext2_readdir: bad entry in directory #687558: rec_len % != 0 -
offset=0, inode=1629513582, rec_len=8302, name_len=28769
Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
ext2_find_entry: bad entry in directory #687558: rec_len % != 0 -
offset=0, inode=1629513582, rec_len=8302, name_len=28769
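The field values in the ext2_readdir errors above are worth a closer
look. The sketch below repacks them the way they would sit on disk and
prints the bytes as text. It assumes the ext2 directory entry layout of
this kernel generation (32-bit inode, 16-bit rec_len, 16-bit name_len,
all little-endian). The helper name is made up for illustration.

```python
import struct


def fields_as_text(inode, rec_len, name_len):
    """Repack logged directory-entry fields into their assumed on-disk
    byte order (little-endian u32, u16, u16) and decode them as ASCII."""
    raw = struct.pack("<IHH", inode, rec_len, name_len)
    return raw.decode("ascii", errors="replace")


# (inode, rec_len, name_len) copied from the two distinct log entries above
samples = [
    (1952671082, 25632, 29541),  # directory #687459
    (1629513582, 8302, 28769),   # directory #687558
]

if __name__ == "__main__":
    for sample in samples:
        print(repr(fields_as_text(*sample)))
```

Under that layout assumption, the two entries decode to 'ject des' and
'ng an ap', which are fragments of ordinary English text. That would be
consistent with file data having been written over directory blocks, as
described earlier, rather than with random bit errors.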
--
From the catapult of J.D. Baldwin |+| "If anyone disagrees with anything I
_,_ Finger baldwin@netcom.com |+| say, I am quite prepared not only to
_|70|___:::)=}- for PGP public |+| retract it, but also to deny under
\ / key information. |+| oath that I ever said it." --T. Lehrer
***~~~~-----------------------------------------------------------------------