[747] in linux-scsi channel archive
Segment violations during disk access
daemon@ATHENA.MIT.EDU (Louis Mandelstam)
Thu Nov 2 10:24:42 1995
Date: Thu, 2 Nov 1995 11:36:03 +0200 (GMT+0200)
From: Louis Mandelstam <louis@sacc.wn.apc.org>
To: linux-scsi@vger.rutgers.edu
Good day -
I've been experiencing a problem with the SCSI subsystem on a 1.2.13
machine, with disk accesses sometimes causing the calling process to
crash, occasionally taking the system with it.
One sure way to get the problem to show itself is to run badblocks
/dev/sda1 <blocksondisk> - badblocks almost always dies with a 'Segment
violation' before completion.
I've tried replacing the hard disk (a Conner 1080S) with a Seagate
Barracuda which is known to work fine, and the problem persisted. I've
also taken out all other SCSI devices and the problem wasn't affected.
I've also tried going to single user and running badblocks without any
other processes possibly accessing the disk, without any change.
I have not been able to try a different SCSI adapter as yet. One warning
which I have been wondering about is the output fdisk gives me when I run it:
'The number of cylinders for this disk is set to 1030.
This is larger than 1024, and may cause problems with some software.'
The partition table -
Disk /dev/sda: 64 heads, 32 sectors, 1030 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot  Begin   Start     End    Blocks   Id  System
/dev/sda1           1       1    1030   1054719+  83  Linux native

Partition 1 has different physical/logical endings:
     phys=(1023, 63, 32) logical=(1029, 63, 32)
Partition 1 does not start on cylinder boundary:
     phys=(0, 0, 2) should be (0, 1, 1)
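
The figures above can be cross-checked with a little arithmetic. What
follows is a minimal sketch in Python, assuming the usual CHS-to-LBA
mapping and the geometry fdisk reports (64 heads, 32 sectors per track,
512-byte sectors). The names chs_to_lba and partition_blocks are
illustrative only and are not taken from fdisk or the kernel.

```python
# Geometry as reported by fdisk for /dev/sda (taken from the output above).
HEADS = 64
SECTORS_PER_TRACK = 32
SECTOR_BYTES = 512


def chs_to_lba(cyl, head, sector):
    """Conventional CHS-to-LBA mapping; sector numbers are 1-based."""
    return (cyl * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def partition_blocks(start_chs, end_chs):
    """Return (whole 1 KiB blocks, leftover bytes) for an inclusive CHS range."""
    sectors = chs_to_lba(*end_chs) - chs_to_lba(*start_chs) + 1
    return divmod(sectors * SECTOR_BYTES, 1024)


start = (0, 0, 2)             # where the partition actually starts
aligned_start = (0, 1, 1)     # where fdisk says it should start
logical_end = (1029, 63, 32)  # logical end reported by fdisk

print(chs_to_lba(*start), chs_to_lba(*aligned_start))  # prints: 1 32
print(partition_blocks(start, logical_end))            # prints: (1054719, 512)
```

Under these assumptions the logical end (1029, 63, 32) gives 1054719
whole 1K blocks plus one leftover 512-byte sector, which is consistent
with the '1054719+' in the Blocks column. The start (0, 0, 2) maps to
absolute sector 1, whereas (0, 1, 1) would be sector 32. The physical
end cannot go above cylinder 1023 because a partition table entry stores
the cylinder number in 10 bits.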
I had the physical and logicals the same at one stage (both 1023 cyls)
and the problem was as it is now.
/proc/pci -
PCI devices found :
  Bus 0 Device 13 Function 0.
    SCSI bus controller : Adaptec 2940 (rev 0). 8259's interrupt 14.
  Bus 0 Device 14 Function 0.
    Ethernet controller : DEC DC21040 (rev 35). 8259's interrupt 15.
  Bus 0 Device 16 Function 0.
    Host bridge : UMC UM8881F (rev 1).
  Bus 0 Device 18 Function 0.
    ISA bridge : UMC UM8673F (rev 1).
The AHA2940 has extended translation disabled. I've tried with and
without Disconnection enabled, and it doesn't appear to affect the problem.
I don't know much about SCSI, and I could be missing something obvious.
I have tried known incorrect termination as well as known correct
termination setups, and could see no change in the system's behaviour.
I've done a low-level format of the drive in the Adaptec BIOS - the
problem wasn't affected. Also, the Adaptec BIOS's disk verify routine
finds no problems.
Any pointers, ideas for tests to try, etc. would be *severely*
appreciated. This computer is supposed to be our champion server and at
the moment it's not reliable enough because of this problem.
Many thanks in advance.
Regards
----------------------------------------------------------------------
L.Mandelstam - System Administrator louis@sacc.wn.apc.org
S A Council of Churches, PO Box 4921, Johannesburg, 2000, South Africa
tel:+27-11-492-1380 x249 fax:+27-11-492-1448
----------------------------------------------------------------------