[1724] in linux-scsi channel archive


Disk gets trashed regularly w/Adaptec 2940!

daemon@ATHENA.MIT.EDU (J.D. Baldwin)
Fri Apr 18 16:19:18 1997

Date: 	Fri, 18 Apr 1997 16:09:24 -0400
From: baldwin@netcom.com (J.D. Baldwin)
To: linux-scsi@vger.rutgers.edu
Cc: dario@milano.europe.dg.com

I posted a query about this problem a while back and didn't get much
feedback--just one or two people suggesting I change adapters.  I
decided to wait for a new kernel release and try it out before taking
that step.  So, after a few days' worth of running 2.0.30, here I am,
asking the same question (in somewhat different groups) in hopes of a
more informative response.

I run 2.0.30 on an ALR Revolution MP II with a PCI bus, an Adaptec
2940 SCSI card controlling a 4G Seagate drive whose model number
I've forgotten.  (The same thing happened with a Quantum disk, so I
think my problem is in the adapter support.)

I have experienced identical problems with *four* different Adaptec
cards, so I don't think I have a bad card.

The problem: when I boot the machine, I get lots of filesystem errors
during e2fsck.  A sample from /var/log/syslog is attached
below.

Then the machine runs fine for 12-24 hours or so, and suddenly this
message is written to /var/log/syslog:

Apr 18 07:40:13 foo-host kernel: SCSI disk error : host 0 channel 0 id 0
    lun 0 return code = 2

At this point, the root filesystem has been remounted read-only.  I can
log on as root and execute `/bin/mount -n -o remount /`, after which the
file system is OK.  Nothing else is ever written to syslog or messages,
but I assume that's just because syslogd gives up writing after its
first failure.
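
For anyone trying to read that "return code = 2", here is a minimal
sketch of how I understand the value breaks down.  It assumes the usual
Linux SCSI layout of the result word (target status in the low byte,
then the message, host and driver bytes), which I have not checked
against the 2.0.30 source.  The name decode_result is made up for
illustration; the status names are the SCSI-2 ones.

```python
# SCSI-2 status byte values (bits 1-5 are significant, the rest reserved).
SCSI2_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x10: "INTERMEDIATE",
    0x14: "INTERMEDIATE-CONDITION MET",
    0x18: "RESERVATION CONFLICT",
    0x22: "COMMAND TERMINATED",
    0x28: "QUEUE FULL",
}


def decode_result(result):
    """Split a SCSI result word into its four bytes.

    ASSUMPTION: status | msg << 8 | host << 16 | driver << 24.
    """
    status = result & 0x3E  # mask off the reserved bits of the status byte
    return {
        "status": status,
        "status_name": SCSI2_STATUS.get(status, "unknown"),
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    # The value from the syslog line above.
    print(decode_result(2))
```

If that assumption holds, 2 means the drive itself returned CHECK
CONDITION while the message, host and driver bytes were all zero.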

That's not all.  Here and there, a few inodes seem to get "shuffled."
Contents of one file get trashed and overwritten with the contents
of another file.  Attempts to delete files in a certain directory
cause an immediate crash of the entire system.  Weird stuff like
that.

As I say, this has happened with this machine with two different
makes of HD and four separate Adaptec SCSI cards.

So, what I'm looking for is:

1) a fix or workaround, or at least an indication that someone,
   somewhere is aware of this problem and working on a fix

2) a recommendation as to another model of SCSI-2 adapter.  I hear
   NCR support in Linux is pretty good.  Any specific model numbers
   in mind?

Thanks in advance for any help.  Here are a few sample errors from 
/var/log/messages as logged on boot-up:

Apr 17 14:21:32 foo-host kernel: EXT2-fs error (device 08:01):
    ext2_check_inodes_bitmap: Wrong free inodes count in group 485,
    stored = 8133, counted = 8139

Apr 17 14:21:32 foo-host kernel: EXT2-fs error (device 08:01):
    ext2_check_inodes_bitmap: Wrong free inodes count in super block,
    stored = 3998702, counted = 3999923

Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
    ext2_readdir: bad entry in directory #687459: rec_len is too small
    for name_len - offset=0, inode=1952671082, rec_len=25632,
    name_len=29541

Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
    ext2_readdir: bad entry in directory #687558: rec_len % != 0 -
    offset=0, inode=1629513582, rec_len=8302, name_len=28769

Apr 17 16:58:16 foo-host kernel: EXT2-fs error (device 08:01):
    ext2_find_entry: bad entry in directory #687558: rec_len % != 0 -
    offset=0, inode=1629513582, rec_len=8302, name_len=28769
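
One thing worth noting about the directory errors: the bogus field
values look like text.  A small sketch, assuming the on-disk ext2
directory entry header on this kernel is a 32-bit inode number followed
by 16-bit rec_len and name_len fields, all little-endian (this is an x86
box).  The helper name raw_dirent_header is hypothetical.

```python
import struct


def raw_dirent_header(inode, rec_len, name_len):
    """Re-pack the logged fields into the 8 bytes they were read from.

    ASSUMPTION: header layout is <u32 inode, u16 rec_len, u16 name_len>,
    little-endian.
    """
    return struct.pack("<IHH", inode, rec_len, name_len)


if __name__ == "__main__":
    # Field values copied from the two distinct bad entries logged above.
    print(raw_dirent_header(1952671082, 25632, 29541))  # b'ject des'
    print(raw_dirent_header(1629513582, 8302, 28769))   # b'ng an ap'
```

Under that assumption, both "entries" decode to fragments of plain
English text, which fits with file contents ending up in blocks that
are supposed to hold directories.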
--
 From the catapult of J.D. Baldwin  |+| "If anyone disagrees with anything I
   _,_    Finger baldwin@netcom.com |+| say, I am quite prepared not only to
 _|70|___:::)=}-  for PGP public    |+| retract it, but also to deny under
 \      /         key information.  |+| oath that I ever said it." --T. Lehrer
***~~~~-----------------------------------------------------------------------


