[2380] in linux-scsi channel archive


Re: BIG-SCSI-Problems

daemon@ATHENA.MIT.EDU (Gerard Roudier)
Fri Aug 29 11:10:50 1997

Date: 	Fri, 29 Aug 1997 16:32:51 +0200 (MET DST)
From: Gerard Roudier <groudier@club-internet.fr>
To: Klaus Dombrofsky <KDombrofsky@compuserve.com>
cc: linux-scsi@vger.rutgers.edu
In-Reply-To: <199708290642_MC2-1E88-6A31@compuserve.com>



On Fri, 29 Aug 1997, Klaus Dombrofsky wrote:

> Hi there,
> i have a very big problem with my linux-machines.
> 
> linux1:
> P100
> 32 MB
> Adaptec 2940UW with external ASS3000-Raid
> Symbios 53C81xx with external CDROM, TAPE, JAZ
> Kernel: 2.0.30
> 
> linux2
> P200
> 128 MB
> Adaptec 2940UW with external ASS3000-Raid
> Symbios 53C81xx with internal CDROM, TAPE, JAZ
> Kernel 2.0.30
> 
> Problem:
> When I access a device on the Symbios very frequently, it often leads to a
> SCSI reset, and after this reset on the Symbios controller the system
> hangs, but there's no SCSI reset on the Adaptec with the RAID and the
> operating system.
> As long as I make no access to the Symbios controller, the system works
> well. But when I want to copy a lot of data onto the jaz or drive, it
> often comes to a SCSI reset, but not always.
> 
> Here is /var/log/messages
> 
> Aug 29 11:42:23 linux1 kernel: scsidisk I/O error: dev 08:11, sector 279056
> Aug 29 11:42:26 linux1 kernel: SCSI disk error : host 1 channel 0 id 1 lun 0 return code = 28000002
> Aug 29 11:42:26 linux1 kernel: Current error sd08:11: sns = f0  4
> Aug 29 11:42:26 linux1 kernel: ASC=15 ASCQ=be

I do not know what ASCQ=be means, but ASC=15 is a positioning error
reported by the drive.

> Aug 29 11:42:26 linux1 kernel: Raw sense data:0xf0 0x00 0x04 0x00 0x04 0x42
                                                            ^^
Sense key = 4 indicates an unrecoverable hardware error.

> 0x32 0x11 0x00 0x00 0x00 0x00 0x15 0xbe 0x00 0x00
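
For reference, the fields discussed above can be pulled out of the raw
sense bytes as follows. This is only a minimal sketch, assuming the
SCSI-2 fixed sense data format (response code 0x70/0x71, sense key in
the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13). The names
RAW_SENSE, SENSE_KEYS and decode_fixed_sense are made up for this
illustration and are not part of any driver.

```python
# Raw sense bytes copied from the kernel log quoted above.
RAW_SENSE = bytes([0xf0, 0x00, 0x04, 0x00, 0x04, 0x42, 0x32, 0x11,
                   0x00, 0x00, 0x00, 0x00, 0x15, 0xbe, 0x00, 0x00])

# Sense key names from the SCSI-2 standard (only a subset is listed).
SENSE_KEYS = {
    0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
    0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR", 0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION", 0x7: "DATA PROTECT", 0xb: "ABORTED COMMAND",
}


def decode_fixed_sense(raw):
    """Decode the main fields of fixed-format SCSI sense data."""
    raw = bytes(raw)
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes to reach ASC/ASCQ")
    # Bit 7 of byte 0 is the 'valid' bit; the low 7 bits are the response code.
    if (raw[0] & 0x7f) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = raw[2] & 0x0f
    return {
        "valid": bool(raw[0] & 0x80),
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "unknown"),
        # Bytes 3..6 are the information field (big-endian).
        "information": int.from_bytes(raw[3:7], "big"),
        "asc": raw[12],
        "ascq": raw[13],
        # Qualifiers of 0x80 and above are vendor-specific in SCSI-2.
        "ascq_vendor_specific": raw[13] >= 0x80,
    }


if __name__ == "__main__":
    for field, value in decode_fixed_sense(RAW_SENSE).items():
        print(field, "=", value)
```

Since ASCQ=be falls in the vendor-specific range, its exact meaning
would have to come from the drive manufacturer's documentation.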

> I already checked the cables and the terminators.

If the JAZ drive is id 1, you should check the drive, try several
different media, etc.

> Is it a problem to disable DISCONNECT on the symbios ?

Problems due to disabling DISCONNECT do not depend on the controller type.
With your configuration, a very long TAPE SCSI command (for example
a rewind) will time out any pending CD-ROM or JAZ SCSI commands.
With DISCONNECT allowed, the TAPE releases the SCSI bus, allowing pending
commands to be sent to devices for execution.
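
The effect can be illustrated with a toy model. This is only a sketch
under strong assumptions: all commands are queued at time 0, bus
transfer phases are ignored, and the durations and timeouts below are
invented for illustration (they are not the values used by any driver).
Without DISCONNECT each command holds the bus until it completes, so
commands are serialized; with DISCONNECT they overlap.

```python
# (name, duration in seconds, timeout in seconds) -- hypothetical values.
COMMANDS = [
    ("tape rewind", 90.0, 300.0),
    ("cdrom read", 0.5, 30.0),
    ("jaz write", 0.5, 30.0),
]


def completion_times(commands, disconnect):
    """Return {name: completion time} for commands all queued at t=0."""
    finish = {}
    bus_free_at = 0.0
    for name, duration, _timeout in commands:
        if disconnect:
            # The device releases the bus while it works, so the
            # commands run in parallel (transfer time is ignored).
            finish[name] = duration
        else:
            # The device keeps the bus until done: strict serialization.
            bus_free_at += duration
            finish[name] = bus_free_at
    return finish


def timed_out(commands, disconnect):
    """Return the names of commands that finish after their timeout."""
    finish = completion_times(commands, disconnect)
    return [name for name, _d, timeout in commands if finish[name] > timeout]


if __name__ == "__main__":
    print("without DISCONNECT:", timed_out(COMMANDS, disconnect=False))
    print("with DISCONNECT:   ", timed_out(COMMANDS, disconnect=True))
```

In this model the CD-ROM and JAZ commands time out behind the rewind
only when DISCONNECT is disabled.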

> Why does the reset on symbios hang the whole system ?

With recent driver versions, I routinely run tests that trigger a SCSI
reset every 30 seconds, and my system does not hang.
I noticed that the drive failure has corrupted some directory block.
That could be the cause of the hang.
Let me know which ncr53c8xx driver version you are currently using.

> An additional Problem on the linux2:
> Sometimes, when there are many read/write accesses on the Adaptec with the
> RAID system, there's also a SCSI reset and then the system hangs; it must
> hang because the OS is running on the RAID.
> But sometimes the RAID system is not recognized any more and the system
> hangs without any kernel messages. Is it a problem to have a SCSI-II RAID
> on a 2940UW?

Sorry, I'm not well up in problems involving Adaptec controllers.

        Gerard.

