[677] in linux-scsi channel archive

Re: Problem with upper .5GB of 2GB disks

daemon@ATHENA.MIT.EDU (Jurgen Botz)
Tue Oct 10 21:13:21 1995

To: linux-scsi@vger.rutgers.edu, ncr53c810@mroe.cs.colorado.edu
In-reply-to: Your message of "Mon, 09 Oct 1995 19:28:24 EDT."
             <199510092328.TAA27469@orixa.mtholyoke.edu> 
Date: Tue, 10 Oct 1995 09:25:05 -0400
From: Jurgen Botz <jbotz@orixa.mtholyoke.edu>

I wrote:
> I have an ASUS PCI system with 3 ASUS SC200 (NCR 53c810) controllers
> and 6 IBM 2GB SCSI disks (2 on each controller).  I'm seeing a very
> strange problem... every single one of the disks was giving I/O errors
> on sectors above roughly 3000.

My mind was getting fuzzy from frustration when I wrote that... I meant
3000000, not 3000.  And even that isn't quite right: on some disks the
I/O errors start closer to the half-way mark.

Since last night's message I've done more experiments; here are some
of the results (using 1.3.32+NCRrel11)...

- I/O errors occur on every disk, starting somewhere on the second
half of the disk.  Once I hit an error, every EVEN sector beyond that
one reports an error.  There are no reports of I/O errors on odd sectors.
I don't know if this is a clue or just an artifact of the way block
devices are handled by the kernel.

- If I repeat the scan (using "badblocks" as a means of reading every
sector), I may get the errors earlier on the disk.

- The disks are not bad... I've tried one in a non-Linux system and had
no problems.

- I'm quite sure that there are no cable or termination problems.

- After getting a bunch of errors I interrupt the scan... often I then
see the following syslog message: "kernel: Weird - unlocked, clean and
not uptodate buffer on list ..."

- Sometimes the kernel eventually dies altogether after enough of the
SCSI errors.  It locks up and responds neither to attempts to switch
to a different console nor to pings from another system.
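
On the even-sector observation: one possible explanation, offered only as
a sketch, is plain block arithmetic.  Assuming the scan reads in 1024-byte
blocks (the badblocks default) while the driver reports 512-byte sectors,
every block starts on an even sector, so an error reported once per failed
block would never name an odd sector.  The block numbers below are
hypothetical, chosen only so that they map onto sectors of the magnitude
seen in the logs; any partition offset is ignored.

```python
SECTOR_SIZE = 512   # bytes per sector, as counted in the scsidisk messages
BLOCK_SIZE = 1024   # assumption: default badblocks block size


def first_sector_of_block(block, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return the number of the first sector covered by the given block."""
    return block * (block_size // sector_size)


# Hypothetical consecutive block numbers, for illustration only.
blocks = range(1175788, 1175792)
starts = [first_sector_of_block(b) for b in blocks]

print(starts)                           # [2351576, 2351578, 2351580, 2351582]
print(all(s % 2 == 0 for s in starts))  # True
```

If that assumption holds, the even-only pattern says nothing about the
hardware; it would only mean that every 1K block past the first failure
is failing.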

Here's an example:

# badblocks -o /tmp/bad.sdb -v /dev/sdb3 1805440
# grep sector /var/log/messages

Oct 10 06:49:58 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351576
Oct 10 06:50:06 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351578
Oct 10 06:50:14 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351580
Oct 10 06:50:23 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351582
Oct 10 06:50:31 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351584
Oct 10 06:50:39 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351586
 ...

I repeat the scan and the errors start at a lower sector:

Oct 10 07:13:14 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349560
Oct 10 07:13:23 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349562
Oct 10 07:13:31 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349564
Oct 10 07:13:39 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349566
Oct 10 07:13:48 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349568
 ...
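
For anyone who wants to check the pattern in their own logs, here is a
minimal sketch that parses the lines above and reports the sector stride,
the seconds between consecutive errors, and how far the first error moved
between the two scans.  The regular expression is written against the
message format shown here only, and the year 1995 is taken from the date
of this message since syslog does not record it; the function and
variable names are just illustrative.

```python
import re
from datetime import datetime

FIRST_RUN = """\
Oct 10 06:49:58 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351576
Oct 10 06:50:06 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351578
Oct 10 06:50:14 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351580
Oct 10 06:50:23 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351582
Oct 10 06:50:31 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351584
Oct 10 06:50:39 news2 kernel: scsidisk I/O error: dev 08:13, sector 2351586
"""

SECOND_RUN = """\
Oct 10 07:13:14 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349560
Oct 10 07:13:23 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349562
Oct 10 07:13:31 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349564
Oct 10 07:13:39 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349566
Oct 10 07:13:48 news2 kernel: scsidisk I/O error: dev 08:13, sector 2349568
"""

# Matches only the message format quoted in this post.
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"scsidisk I/O error: dev [0-9a-f]{2}:[0-9a-f]{2}, sector (\d+)$"
)


def parse(log, year=1995):
    """Return a list of (timestamp, sector) pairs for each error line."""
    events = []
    for line in log.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, int(m.group(2))))
    return events


def summarize(events):
    """Return (set of sector strides, list of gaps in seconds)."""
    sectors = [s for _, s in events]
    strides = {b - a for a, b in zip(sectors, sectors[1:])}
    gaps = [int((b - a).total_seconds())
            for (a, _), (b, _) in zip(events, events[1:])]
    return strides, gaps


first, second = parse(FIRST_RUN), parse(SECOND_RUN)
strides1, gaps1 = summarize(first)
strides2, gaps2 = summarize(second)
shift = first[0][1] - second[0][1]

print(strides1, gaps1)   # {2} [8, 8, 9, 8, 8]
print(strides2, gaps2)   # {2} [9, 8, 8, 9]
print(shift)             # 2016
```

On these excerpts the stride is always 2 sectors, each error takes 8 to 9
seconds to be reported, and the second scan fails 2016 sectors earlier
than the first.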

Any help or feedback or speculation would be appreciated.

--
Jurgen Botz, jbotz@mtholyoke.edu                            C:\ONGRATNS.W95!
"Unix?  What's that?  Is that like Linux?"
