[2373] in linux-scsi channel archive
SCSI Problems
daemon@ATHENA.MIT.EDU (Justin Brown W.)
Fri Aug 29 01:03:55 1997
Date: Thu, 28 Aug 1997 16:33:09 -0500 (CDT)
From: "Justin Brown W." <jbrown@tanstaafl.busprod.com>
To: linux-scsi@vger.rutgers.edu
Summary: HELP!: DPT2044U/4040 Smartcache4 Raid SCSI problems
Keywords: SCSI,Raid,Help
Recently we installed a three-drive RAID 5 array consisting of 3 Seagate
ST15230N (Hawk) drives (4.1 gigs apiece), controlled by a DPT SmartCache IV
2044U with a RAID RC4040 module. The array was installed in a PPro 200 with
128 megs of RAM and an additional DPT 2124 SCSI controller. Under Linux
2.0.30, using the EATA-DMA driver, the kernel would find the array and allow
access. Running fdisk and then formatting would ultimately lead to a total
lockup of the machine, but not before a flood of SCSI errors. Here is an
excerpt of the log that contains the clearest examples of what happened
(I've chopped out the name daemon and sendmail log entries):
Aug 27 19:44:34 shell linux: Console: 8 point font, 400 scans
Aug 27 19:44:35 shell linux: Console: colour VGA+ 80x50, 1 virtual console (max 63)
Aug 27 19:44:35 shell linux: pcibios_init : BIOS32 Service Directory structure at 0x000fad00
Aug 27 19:44:35 shell linux: pcibios_init : BIOS32 Service Directory entry at 0xfb180
Aug 27 19:44:36 shell linux: pcibios_init : PCI BIOS revision 2.10 entry at 0xfb1b0
Aug 27 19:44:36 shell linux: Probing PCI hardware.
Aug 27 19:44:36 shell linux: Calibrating delay loop.. ok - 198.66 BogoMIPS
Aug 27 19:44:36 shell linux: Memory: 125820k/129024k available (736k kernel code, 384k reserved, 2084k data)
Aug 27 19:44:36 shell linux: Swansea University Computer Society NET3.035 for Linux 2.0
Aug 27 19:44:36 shell linux: NET3: Unix domain sockets 0.13 for Linux NET3.035.
Aug 27 19:44:37 shell linux: Swansea University Computer Society TCP/IP for NET3.034
Aug 27 19:44:37 shell linux: IP Protocols: IGMP, ICMP, UDP, TCP
Aug 27 19:44:37 shell linux: Swansea University Computer Society IPX 0.34 for NET3.035
Aug 27 19:44:37 shell linux: IPX Portions Copyright (c) 1995 Caldera, Inc.
Aug 27 19:44:37 shell linux: VFS: Diskquotas version dquot_5.6.0 initialized
Aug 27 19:44:37 shell linux: Checking 386/387 coupling... Ok, fpu using exception 16 error reporting.
Aug 27 19:44:38 shell linux: Checking 'hlt' instruction... Ok.
Aug 27 19:44:38 shell linux: Linux version 2.0.30 (root@shell) (gcc version 2.7.2.1) #12 Wed Aug 27 19:35:34 CDT 1997
Aug 27 19:44:38 shell linux: Serial driver version 4.13 with no serial options enabled
Aug 27 19:44:38 shell linux: tty00 at 0x03f8 (irq = 4) is a 16550A
Aug 27 19:44:38 shell linux: tty01 at 0x02f8 (irq = 3) is a 16550A
Aug 27 19:44:38 shell linux: lp1 at 0x0378, (polling)
Aug 27 19:44:39 shell linux: Ramdisk driver initialized : 16 ramdisks of 0K size
Aug 27 19:44:39 shell linux: ide: i82371 PIIX (Triton) on PCI bus 0 function 57
Aug 27 19:44:39 shell linux: ide: neither port is enabled
Aug 27 19:44:39 shell linux: Floppy drive(s): fd0 is 1.44M
Aug 27 19:44:39 shell linux: Started kswapd v 1.4.2.2
Aug 27 19:44:40 shell linux: FDC 0 is a post-1991 82077
Aug 27 19:44:40 shell linux: EATA (Extended Attachment) driver version: 2.59b
Aug 27 19:44:40 shell linux: developed in co-operation with DPT
Aug 27 19:44:40 shell linux: (c) 1993-96 Michael Neuffer, mike@i-Connect.Net
Aug 27 19:44:40 shell linux: Registered HBAs:
Aug 27 19:44:41 shell linux: HBA no. Boardtype Revis EATA Bus BaseIO IRQ DMA Ch ID Pr QS S/G IS
Aug 27 19:44:41 shell linux: scsi0 : PM2044U v07K.V 2.0c PCI 0x8010 11 BMST 1 7 N 64 252 Y
Aug 27 19:44:41 shell linux: scsi1 : PM2124A/9X v07C.0 2.0c PCI 0x8110 15 BMST 1 7 N 64 64 N
Aug 27 19:44:41 shell linux: scsi0 : EATA (Extended Attachment) HBA driver
Aug 27 19:44:41 shell linux: scsi1 : EATA (Extended Attachment) HBA driver
Aug 27 19:44:41 shell linux: scsi : 2 hosts.
Aug 27 19:44:42 shell linux: Vendor: DPT Model: RAID-5 Rev: 07KV
Aug 27 19:44:42 shell linux: Type: Direct-Access ANSI SCSI revision: 02
Aug 27 19:44:42 shell linux: Detected scsi disk sda at scsi0, channel 0, id 4, lun 0
Aug 27 19:44:42 shell linux: scsi0: queue depth for target 4 on channel 0 set to 64
Aug 27 19:44:42 shell linux: Vendor: FUJITSU Model: M2694ES-512 Rev: 812A
Aug 27 19:44:42 shell linux: Type: Direct-Access ANSI SCSI revision: 01 CCS
Aug 27 19:44:43 shell linux: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0
Aug 27 19:44:43 shell linux: scsi1: queue depth for target 0 on channel 0 set to 64
Aug 27 19:44:43 shell linux: scsi : detected 2 SCSI disks total.
Aug 27 19:44:43 shell linux: SCSI device sda: hdwr sector= 512 bytes. Sectors= 16772096 [8189 MB] [8.2 GB]
Aug 27 19:44:43 shell linux: SCSI device sdb: hdwr sector= 512 bytes. Sectors= 2117024 [1033 MB] [1.0 GB]
Aug 27 19:44:44 shell linux: wd.c:v1.10 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)
Aug 27 19:44:44 shell linux: eth0: WD80x3 at 0x280, 00 00 C0 5C 90 A4 WD8013, IRQ 3, shared memory at 0xd0000-0xd3fff.
Aug 27 19:44:44 shell linux: Partition check:
Aug 27 19:44:44 shell linux: sda: sda1 sda2 sda3 sda4
Aug 27 19:44:44 shell linux: sdb: sdb1 sdb2 sdb3 sdb4
Aug 27 19:44:45 shell linux: JAVA Binary support v1.01 for Linux 1.3.98 (C)1996 Brian A. Lantz
Aug 27 19:44:45 shell linux: VFS: Mounted root (ext2 filesystem) readonly.
Aug 27 19:44:45 shell linux: Adding Swap: 67532k swap-space (priority 2144)
Aug 27 19:44:45 shell kernel: Kernel logging (proc) started.
Aug 27 19:44:58 shell login[105]: ROOT LOGIN ON TTY `tty1'
Aug 27 19:46:50 shell login[106]: ROOT LOGIN ON TTY `tty2'
Aug 27 19:47:10 shell linux: scsi : aborting command due to timeout : pid 10522, scsi0, channel 0, id 4, lun 0 Write (6) 0a 0f 98 f4 00
Aug 27 19:47:10 shell linux: eata_abort called pid: 10522 target: 4 lun: 0 reason 3
Aug 27 19:47:10 shell linux: Returning: SCSI_ABORT_BUSY
Aug 27 19:47:10 shell kernel: scsi : aborting command due to timeout : pid 10524, scsi0, channel 0, id 4, lun 0 Write (6) 0a 11 80 f4 00
Aug 27 19:47:11 shell son 3
Aug 27 19:47:11 shell 11 kernel: eata_abort called pid: 10619 target: 4 lun: 0 reason 3
Aug 27 19:47:11 shell kernel: scsi : aborting command due to timeout : pid 10624, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1c e8 f4 00
Aug 27 19:47:11 shell kernel: scsi : aborting command due to timeout : pid 10626, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1d dc f4 00
Aug 27 19:47:11 shell kernel: scsi : aborting command due to timeout : pid 10640, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1e d0 f4 00
Aug 27 19:47:11 shell kernel: scsi : aborting command due to timeout : pid 10648, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1f c4 f4 00
Aug 27 19:47:11 shell kernel: scsi : aborting command due to timeout : pid 10651, scsi0, channel 0, id 4, lun 0 Write (6) 0a 20 b8 f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10652, scsi0, channel 0, id 4, lun 0 Write (6) 0a 21 ac f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10653, scsi0, channel 0, id 4, lun 0 Write (6) 0a 22 a0 f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10654, scsi0, channel 0, id 4, lun 0 Write (6) 0a 23 94 f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10655, scsi0, channel 0, id 4, lun 0 Write (6) 0a 24 88 f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10656, scsi0, channel 0, id 4, lun 0 Write (6) 0a 25 7c f4 00
Aug 27 19:47:12 shell kernel: scsi : aborting command due to timeout : pid 10657, scsi0, channel 0, id 4, lun 0 Write (6) 0a 26 70 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10658, scsi0, channel 0, id 4, lun 0 Write (6) 0a 27 64 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10659, scsi0, channel 0, id 4, lun 0 Write (6) 0a 28 58 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10660, scsi0, channel 0, id 4, lun 0 Write (6) 0a 29 4c f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10661, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2a 40 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10662, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2b 34 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10663, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2c 28 f4 00
Aug 27 19:47:13 shell kernel: scsi : aborting command due to timeout : pid 10664, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2d 1c f4 00
Aug 27 19:47:14 shell kernel: scsi : aborting command due to timeout : pid 10666, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2e 10 f4 00
Aug 27 19:47:14 shell kernel: scsi : aborting command due to timeout : pid 10668, scsi0, channel 0, id 4, lun 0 Write (6) 0a 2f f8 f4 00
Aug 27 19:47:14 shell kernel: scsi : aborting command due to timeout : pid 10669, scsi0, channel 0, id 4, lun 0 Write (6) 0a 30 ec f4 00
Aug 27 19:47:14 shell kernel: scsi : aborting command due to timeout : pid 10670, scsi0, channel 0, id 4, lun 0 Write (6) 0a 31 e0 f4 00
Aug 27 19:47:14 shell kernel: scsi : aborting command due to timeout : pid 10690, scsi0, channel 0, id 4, lun 0 Write (6) 0a 32 d4 f4 00
Aug 27 19:47:15 shell kernel: scsi : aborting command due to timeout : pid 10691, scsi0, channel 0, id 4, lun 0 Write (6) 0a 33 c8 f4 00
Aug 27 19:47:15 shell kernel: scsi : aborting command due to timeout : pid 10692, scsi0, channel 0, id 4, lun 0 Write (6) 0a 34 bc f4 00
Aug 27 19:47:15 shell kernel: scsi : aborting command due to timeout : pid 10693, scsi0, channel 0, id 4, lun 0 Write (6) 0a 35 b0 f4 00
Aug 27 19:47:15 shell kernel: scsi : aborting command due to timeout : pid 10694, scsi0, channel 0, id 4, lun 0 Write (6) 0a 36 a4 f4 00
Aug 27 19:47:16 shell kernel: scsi : aborting command due to timeout : pid 10696, scsi0, channel 0, id 4, lun 0 Write (6) 04 18 02 04 00
Aug 27 19:47:16 shell kernel: scsi : aborting command due to timeout : pid 10697, scsi0, channel 0, id 4, lun 0 Write (6) 04 18 10 02 00
Aug 27 19:47:16 shell kernel: scsi : aborting command due to timeout : pid 10698, scsi0, channel 0, id 4, lun 0 Write (6) 04 1a 38 02 00
Aug 27 19:47:17 shell kernel: scsi : aborting command due to timeout : pid 10699, scsi0, channel 0, id 4, lun 0 Write (6) 04 58 18 10 00
Aug 27 19:47:17 shell kernel: scsi : aborting command due to timeout : pid 10700, scsi0, channel 0, id 4, lun 0 Write (6) 05 98 24 0a 00
Aug 27 19:47:17 shell kernel: scsi : aborting command due to timeout : pid 10701, scsi0, channel 0, id 4, lun 0 Write (6) 05 9c ae f4 00
Aug 27 19:47:17 shell kernel: scsi : aborting command due to timeout : pid 10702, scsi0, channel 0, id 4, lun 0 Write (6) 05 9d a2 f4 00
Aug 27 19:47:18 shell kernel: scsi : aborting command due to timeout : pid 10703, scsi0, channel 0, id 4, lun 0 Write (6) 05 9e 96 f4 00
Aug 27 19:47:18 shell kernel: scsi : aborting command due to timeout : pid 10704, scsi0, channel 0, id 4, lun 0 Write (6) 05 9f 8a f4 00
Aug 27 19:47:18 shell kernel: scsi : aborting command due to timeout : pid 10705, scsi0, channel 0, id 4, lun 0 Write (6) 05 a0 7e f4 00
Aug 27 19:47:18 shell kernel: scsi : aborting command due to timeout : pid 10706, scsi0, channel 0, id 4, lun 0 Write (6) 05 a1 72 52 00
Aug 27 19:47:18 shell kernel: scsi : aborting command due to timeout : pid 10707, scsi0, channel 0, id 4, lun 0 Write (6) 0a 58 0c 06 00
Aug 27 19:47:19 shell kernel: scsi : aborting command due to timeout : pid 10708, scsi0, channel 0, id 4, lun 0 Write (6) 0a 5a 0c 10 00
Aug 27 19:47:19 shell kernel: scsi : aborting command due to timeout : pid 10709, scsi0, channel 0, id 4, lun 0 Write (6) 0a 98 0c 0c 00
Aug 27 19:48:06 shell linux: scsi : aborting command due to timeout : pid 11944, scsi0, channel 0, id 4, lun 0 Write (6) 15 a8 5a f4 00
Aug 27 19:48:08 shell linux: eata_abort called pid: 11944 target: 4 lun: 0 reason 3
Aug 27 19:48:14 shell linux: Returning: SCSI_ABORT_BUSY
Aug 27 19:48:07 shell kernel: scsi : aborting command due to timeout : pid 11946, scsi0, channel 0, id 4, lun 0 Write (6) 15 aa 42 f4 00
Aug 27 19:48:14 shell son 3
Aug 27 19:48:14 shell 08 kernel: eata_abort called pid: 11957 target: 4 lun: 0 reason 3
Aug 27 19:48:14 shell el 0, id 4, lun 0 Write (6) 15 ba 76 f4 00
Aug 27 19:48:14 shell : aborting command due to timeout : pid 11969, scsi0, channel 0, id 4, lun 0 Write (6) 15 c0 2e f4 00
Aug 27 19:48:14 shell : Returning: SCSI_ABORT_BUSY
Aug 27 19:48:14 shell d: 11981 target: 4 lun: 0 reason 3
Aug 27 19:48:14 shell d1 56 f4 00
Aug 27 19:48:14 shell kernel: scsi : aborting command due to timeout : pid 11990, scsi0, channel 0, id 4, lun 0 Write (6) 15 d4 32 f4 00
Aug 27 19:48:15 shell kernel: scsi : aborting command due to timeout : pid 11993, scsi0, channel 0, id 4, lun 0 Write (6) 15 d7 0e f4 00
Aug 27 19:48:15 shell kernel: scsi : aborting command due to timeout : pid 11994, scsi0, channel 0, id 4, lun 0 Write (6) 15 da 1e f4 00
Aug 27 19:48:15 shell kernel: scsi : aborting command due to timeout : pid 11998, scsi0, channel 0, id 4, lun 0 Write (6) 15 dd ee f4 00
Aug 27 19:48:16 shell kernel: scsi : aborting command due to timeout : pid 11999, scsi0, channel 0, id 4, lun 0 Write (6) 15 de e2 f4 00
Aug 27 19:48:16 shell kernel: scsi : aborting command due to timeout : pid 12000, scsi0, channel 0, id 4, lun 0 Write (6) 15 df d6 f4 00
Aug 27 19:48:16 shell kernel: scsi : aborting command due to timeout : pid 12001, scsi0, channel 0, id 4, lun 0 Write (6) 15 e0 ca f4 00
Aug 27 19:48:16 shell kernel: scsi : aborting command due to timeout : pid 12002, scsi0, channel 0, id 4, lun 0 Write (6) 15 e1 be f4 00
Aug 27 19:48:16 shell kernel: scsi : aborting command due to timeout : pid 12003, scsi0, channel 0, id 4, lun 0 Write (6) 15 e2 b2 f4 00
Aug 27 19:48:17 shell kernel: scsi : aborting command due to timeout : pid 12004, scsi0, channel 0, id 4, lun 0 Write (6) 15 e3 a6 f4 00
Aug 27 19:48:18 shell kernel: scsi : aborting command due to timeout : pid 12005, scsi0, channel 0, id 4, lun 0 Write (6) 15 e4 9a f4 00
Aug 27 19:48:18 shell kernel: scsi : aborting command due to timeout : pid 12006, scsi0, channel 0, id 4, lun 0 Write (6) 15 e5 8e f4 00
Aug 27 19:48:19 shell kernel: scsi : aborting command due to timeout : pid 12007, scsi0, channel 0, id 4, lun 0 Write (6) 15 e6 82 f4 00
Aug 27 19:48:19 shell kernel: scsi : aborting command due to timeout : pid 12008, scsi0, channel 0, id 4, lun 0 Write (6) 15 e7 76 f4 00
Aug 27 19:48:19 shell kernel: scsi : aborting command due to timeout : pid 12009, scsi0, channel 0, id 4, lun 0 Write (6) 15 e8 6a f4 00
Aug 27 19:48:20 shell kernel: scsi : aborting command due to timeout : pid 12010, scsi0, channel 0, id 4, lun 0 Write (6) 15 e9 5e f4 00
Aug 27 19:48:20 shell kernel: scsi : aborting command due to timeout : pid 12011, scsi0, channel 0, id 4, lun 0 Write (6) 15 ea 52 f4 00
Aug 27 19:48:20 shell kernel: scsi : aborting command due to timeout : pid 12012, scsi0, channel 0, id 4, lun 0 Write (6) 15 eb 46 f4 00
Aug 27 19:48:21 shell kernel: scsi : aborting command due to timeout : pid 12013, scsi0, channel 0, id 4, lun 0 Write (6) 15 ec 3a f4 00
Aug 27 19:48:21 shell kernel: scsi : aborting command due to timeout : pid 12014, scsi0, channel 0, id 4, lun 0 Write (6) 15 ed 2e f4 00
Aug 27 19:48:21 shell kernel: scsi : aborting command due to timeout : pid 12015, scsi0, channel 0, id 4, lun 0 Write (6) 15 ee 22 f4 00
Aug 27 19:48:22 shell kernel: scsi : aborting command due to timeout : pid 12016, scsi0, channel 0, id 4, lun 0 Write (6) 15 ef 16 f4 00
Aug 27 19:48:22 shell kernel: scsi : aborting command due to timeout : pid 12018, scsi0, channel 0, id 4, lun 0 Write (6) 15 f0 0a f4 00
Aug 27 19:48:23 shell kernel: scsi : aborting command due to timeout : pid 12020, scsi0, channel 0, id 4, lun 0 Write (6) 15 f0 fe f4 00
Aug 27 19:48:23 shell kernel: scsi : aborting command due to timeout : pid 12023, scsi0, channel 0, id 4, lun 0 Write (6) 15 f1 f2 f4 00
Aug 27 19:48:23 shell kernel: scsi : aborting command due to timeout : pid 12025, scsi0, channel 0, id 4, lun 0 Write (6) 15 f2 e6 f4 00
Aug 27 19:48:24 shell kernel: scsi : aborting command due to timeout : pid 12029, scsi0, channel 0, id 4, lun 0 Write (6) 15 f5 c2 f4 00
Aug 27 19:56:20 shell linux: scsi : aborting command due to timeout : pid 14584, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1b e6 f4 00
Aug 27 19:56:21 shell linux: eata_abort called pid: 14584 target: 4 lun: 0 reason 3
Aug 27 19:56:26 shell linux: Returning: SCSI_ABORT_BUSY
Aug 27 19:56:20 shell kernel: scsi : aborting command due to timeout : pid 14587, scsi0, channel 0, id 4, lun 0 Write (6) 0a 1d e8 f4 00
Aug 27 19:56:37 shell son 3
Aug 27 19:56:37 shell 22 kernel: eata_abort called pid: 14601 target: 4 lun: 0 reason 3
Aug 27 19:57:06 shell linux: scsi : aborting command due to timeout : pid 15262, scsi0, channel 0, id 4, lun 0 Read (6) 1b 58 0c 02 00
Aug 27 20:00:27 shell syslogd: restart
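For anyone reading the excerpt above: the five hex bytes after "Write (6)"
appear to be the rest of the 6-byte SCSI command block following the opcode.
That is an assumption about how the kernel prints commands, not something
confirmed here; the field layout used below is the standard 6-byte format.
A minimal Python sketch that decodes them (the name decode_cdb6_tail is made
up for illustration):

```python
def decode_cdb6_tail(hex_bytes):
    """Decode the five bytes that follow the opcode in a 6-byte SCSI CDB.

    Assumes the log line shows bytes 1..5 of the command block (an
    assumption about the log format, not a documented guarantee).
    """
    b = [int(x, 16) for x in hex_bytes.split()]
    if len(b) != 5:
        raise ValueError("expected 5 bytes after the opcode")
    lun = b[0] >> 5                                    # top 3 bits of byte 1
    lba = ((b[0] & 0x1F) << 16) | (b[1] << 8) | b[2]   # 21-bit block address
    blocks = b[3] if b[3] != 0 else 256                # 0 means 256 blocks
    return {"lun": lun, "lba": lba, "blocks": blocks, "control": b[4]}


# Two consecutive timed-out writes taken from the excerpt (pids 10624, 10626).
first = decode_cdb6_tail("0a 1c e8 f4 00")
second = decode_cdb6_tail("0a 1d dc f4 00")
print(first)
print(second["lba"] - first["lba"])
```

Under that reading, these two commands are back-to-back 244-block writes
(about 122 KB each with 512-byte sectors), the second starting exactly where
the first ends, which looks like sequential write traffic from the format.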
The machine never panicked; it just locked up, usually with garbage on the
screen. This problem didn't occur when formatting small partitions, but
partitions over 500 megs in size would report SCSI_ABORT_BUSY errors, and
anything over a gig would invariably crash. We did pull the other DPT
controller, but running with or without the 2124 had no discernible effect.
We tried between 4 and 64 megs of cache on the controller. We changed the
PCI burst rates from the slowest to the fastest, and the transfer rates from
5 MHz to 20 MHz; none of these had any effect, although the less cache there
was, the sooner the problem would begin and the sooner the crash would
occur. Has anyone else experienced problems of this sort, and does anyone
have any suggestions we might try to fix the problem (beyond the obvious one
of buying a totally different RAID system)?
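
To compare how quickly the timeouts pile up from one run to the next (for
example with different cache sizes), a rough sketch like the following could
tally them per minute from a saved log. The function name and the sample
lines are only illustrative, and the regular expression assumes the syslog
layout shown in the excerpt:

```python
import re
from collections import Counter

# Captures the HH:MM part of the timestamp on each "aborting command" line.
TIMEOUT_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d\d:\d\d):\d\d\s.*aborting command due to timeout"
)


def timeouts_per_minute(lines):
    """Count timed-out SCSI commands per minute of log time."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the excerpt; a real run would read the log file.
sample = [
    "Aug 27 19:47:10 shell linux: scsi : aborting command due to timeout : "
    "pid 10522, scsi0, channel 0, id 4, lun 0 Write (6) 0a 0f 98 f4 00",
    "Aug 27 19:47:10 shell linux: eata_abort called pid: 10522 target: 4 "
    "lun: 0 reason 3",
    "Aug 27 19:48:06 shell linux: scsi : aborting command due to timeout : "
    "pid 11944, scsi0, channel 0, id 4, lun 0 Write (6) 15 a8 5a f4 00",
    "Aug 27 19:57:06 shell linux: scsi : aborting command due to timeout : "
    "pid 15262, scsi0, channel 0, id 4, lun 0 Read (6) 1b 58 0c 02 00",
]

counts = timeouts_per_minute(sample)
for minute, n in sorted(counts.items()):
    print(minute, n)
```

Lines that syslog garbled badly enough to lose the "aborting command" text
are simply skipped, so the counts are a lower bound.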
--
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
| Justin William Brown | "Damnit Jim, I'm a Doctor not a target!" |
| jbrown@tanstaafl.busprod.com |
| jbrown@gnu.ai.mit.edu | "What happens when V.R. becomes R.L.?" |
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+