[1063] in linux-scsi channel archive

persistent errors, help!

daemon@ATHENA.MIT.EDU (Jussi Larjo)
Wed Dec 4 06:51:19 1996

To: linux-scsi@vger.rutgers.edu
Reply-To: larjo@cc.tut.fi
Date: 	Wed, 04 Dec 1996 13:49:19 +0200
From: Jussi Larjo <larjo@solifer.ee.tut.fi>



	Hello there.

I'm having continuous problems with my system's disks. 

System:
Pentium 100, 16 MB RAM, Red Hat 4.0 with kernel 2.0.25.
Adaptec AHA-1542CF with a Seagate Hawk 1 GB disk, a Micropolis Aries 2 GB
disk, a Philips CDD-2000 CD-R drive and an HP 2 GB DAT drive.

1. Apparently locate, which is run nightly by cron, has problems with the 
Seagate. 

- log transcript -
Dec  3 01:02:02 norton kernel: EXT2-fs error (device 08:04): ext2_readdir: bad 
entry in directory #82: rec_len %  != 0 - offset=0, inode=0, rec_len=37, 
name_len=0
Dec  3 01:02:31 norton kernel: attempt to access beyond end of device
Dec  3 01:02:31 norton kernel: 08:01: rw=0, want=442616, limit=342000

<last two lines repeated with different 'want' entries a dozen times>

Dec  3 01:02:31 norton kernel: Filesystem panic (dev 08:01, mounted on 
08:04:35126)
Dec  3 01:02:31 norton kernel:   FAT error
Dec  3 01:02:31 norton kernel: Directory 3242971: bad FAT
Dec  3 01:02:31 norton kernel: Filesystem panic (dev 08:01, mounted on 
08:04:35126)
Dec  3 01:02:31 norton kernel:   fat_free: deleting beyond EOF
- end of transcript -

The system runs fsck at every boot, usually with no complaints.
08:04 is my root partition and 08:01 is a DOS partition; both are on the
Seagate.
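
For anyone matching the device numbers in the log to partitions, here is a
minimal Python sketch. It assumes the usual Linux convention that major 8 is
the SCSI disk driver with 16 minor numbers per disk, and that the kernel
prints both numbers in hex. The function name decode_scsi_dev is only
illustrative.

```python
def decode_scsi_dev(dev):
    """Turn a kernel 'major:minor' pair in hex (e.g. '08:04') into an sd name.

    Assumption: major 8 is the SCSI disk driver, each disk owns 16 minors,
    and minor 0 within each group of 16 is the whole disk.
    """
    major_s, minor_s = dev.split(":")
    major, minor = int(major_s, 16), int(minor_s, 16)
    if major != 8:
        raise ValueError("not a SCSI disk (major 8): %r" % dev)
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else name + str(part)


for dev in ("08:04", "08:01"):
    print(dev, "->", decode_scsi_dev(dev))
```

Under those assumptions 08:04 is /dev/sda4 and 08:01 is /dev/sda1, which
agrees with both partitions being on the first disk.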

2. Then yesterday, while we were writing a CD-R, we ran into real trouble:

- log transcript -
Dec  3 13:57:22 norton kernel: scsi : aborting command due to timeout : pid 
283798, scsi0, channel 0, id 0, lun 0 Read (6) 17 cc 26 02 00 
Dec  3 13:57:23 norton kernel: scsi : aborting command due to timeout : pid 
283798, scsi0, channel 0, id 0, lun 0 Read (6) 17 cc 26 02 00 
Dec  3 13:57:23 norton kernel: SCSI host 0 abort (pid 283798) timed out - 
resetting
Dec  3 13:57:23 norton kernel: SCSI bus is being reset for host 0 channel 0.
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): Sent BUS DEVICE RESET to 
target 0
Dec  3 13:57:23 norton kernel: Sending DID_RESET for target 0
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): Sending DID_RESET for 
target 0
Dec  3 13:57:23 norton last message repeated 5 times
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): scsi : aborting command 
due to timeout : pid 283798, scsi0, channel 0, id 0, lun 0 Read (6) 17 cc 26 
02 00 
Dec  3 13:57:23 norton kernel: SCSI host 0 abort (pid 283798) timed out - 
resetting
Dec  3 13:57:23 norton kernel: SCSI bus is being reset for host 0 channel 0.
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): Sent BUS DEVICE RESET to 
target 0
Dec  3 13:57:23 norton kernel: Sending DID_RESET for target 0
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): Sending DID_RESET for 
target 0
Dec  3 13:57:23 norton last message repeated 5 times
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): scsi : aborting command 
due to timeout : pid 283798, scsi0, channel 0, id 0, lun 0 Read (6) 17 cc 26 
02 00 
Dec  3 13:57:23 norton kernel: SCSI host 0 abort (pid 283798) timed out - 
resetting
Dec  3 13:57:23 norton kernel: SCSI bus is being reset for host 0 channel 0.
Dec  3 13:57:23 norton kernel: aha1542_out failed(2): Sent BUS DEVICE RESET to 
target 0
Dec  3 13:57:23 norton kernel: Sending DID_RESET for target 0
Dec  3 13:57:23 norton last message repeated 6 times
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=2 
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=3 
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=4 
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=5 
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=6 
Dec  3 13:57:23 norton kernel: aha1542_intr_handle: Unexpected interrupt
Dec  3 13:57:23 norton kernel: tarstat=0, hastat=0 idlun=8 ccb#=7 
- end of transcript -
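
To see which block the timed-out command was reading, the bytes printed
after "Read (6)" can be decoded by hand. The Python sketch below assumes the
kernel prints the five command bytes that follow the opcode, laid out as in
the SCSI-2 READ(6) command: LUN in the top 3 bits of the first byte, a 21-bit
block address, a transfer length (0 meaning 256 blocks) and a control byte.
The function name decode_read6 is only illustrative.

```python
def decode_read6(tail_hex):
    """Decode the five bytes that follow the opcode of a READ(6) command.

    Assumption: tail_hex is the hex dump from the log line, e.g.
    '17 cc 26 02 00', with the opcode byte itself already stripped.
    """
    b = [int(x, 16) for x in tail_hex.split()]
    if len(b) != 5:
        raise ValueError("expected 5 bytes after the opcode, got %d" % len(b))
    lun = b[0] >> 5                                   # top 3 bits of byte 1
    lba = ((b[0] & 0x1F) << 16) | (b[1] << 8) | b[2]  # 21-bit block address
    blocks = b[3] if b[3] != 0 else 256               # 0 means 256 blocks
    return {"lun": lun, "lba": lba, "blocks": blocks, "control": b[4]}


print(decode_read6("17 cc 26 02 00"))
```

Read this way, the command is a 2-block read at logical block 1559590 on
id 0, lun 0. Every timeout line in the transcript carries the same pid and
the same bytes, so it is one request being retried.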

As the CD-R appeared to be working correctly and we had to let it finish its
job, we did not reboot the system at that time. In the end, the system
crashed just after the burn finished. The CD-writer process runs with
realtime priority, so apparently it hung somewhere and left no CPU time for
other processes.

We are aware of the cabling-related problems with the 1542C models. Since
the first problem is very repeatable, we do not believe cabling is the cause
here. The cables are of the recommended type.

Can anybody suggest how to isolate the problem? The system is up and
running now, but we really can't rely on it...

	Mr. Jussi Larjo
	Research Associate
	Plasma Technology Laboratory
	Tampere University of Technology
	Finland

	Phone:  +358 3 365 2516
	Fax:    +358 3 365 2090
	E-mail: larjo@cc.tut.fi


