[1233] in linux-scsi channel archive


Re: 2.0.27: aha1740[9]_mbxout wait!

daemon@ATHENA.MIT.EDU (Barrie Spence)
Wed Jan 8 01:27:43 1997

Date: 	Tue, 07 Jan 1997 22:45:14 +0000
From: Barrie Spence <barrie@calvin.demon.co.uk>
To: Michael Weller <eowmob@exp-math.uni-essen.de>
CC: linux-scsi@vger.rutgers.edu

Michael Weller wrote:
> 
> On Sun, 5 Jan 1997, Jon Lewis wrote:
> 
> > On Sun, 5 Jan 1997, Barrie Spence wrote:
> >
> > > Since I upgraded my system from a 486DX2-66 to a P5/120 (x2) with an
> > > ASUS P55T2P4D motherboard, I've been getting these messages. As far as I
> > > can tell, they only occur when the system is idle - never under any
> > > significant load on the disk/controller.
> > >
> > > I believe that this may simply be a timing problem introduced with the
> > > pentium - as part of the upgrade I ran the board with a single processor
> > > and with the caches disabled - during that time, I don't believe I ever
> > > saw these messages.
> >
> > I ran a P90 news server with a 1740 for some months.  It used to get these
> > messages as well...and was rarely idle.  I ended up wanting multiple SCSI
> > busses, and the 1740 driver seemed to lack support for more than one
> > card...so I went with NCR 810's.
> 
> A historical word:
> 
> The 1740 signals mbxout when it is currently not able to accept an address
> for a new SCSI command descriptor. The address has four bytes which have
> to be transferred one by one; when the address was only partially
> transferred the 1740 also signals mbxout (until all 4 bytes are set). In
> the first version of the 1740 driver this could lead to a race condition
> where two processes accessed the 1740 at the same time, leading to an
> mbxout deadlock when one wrote only part of the address.
> 
> This problem was solved long ago; according to Adaptec the CPU will never
> see mbxout_wait otherwise, because the 1740 is so incredibly fast and will
> handle a new SCSI command descriptor so quickly that the CPU can't ever
> catch it with mbxout_wait set.
> 
> Today:
> 
> When I last looked in the 1740 driver, in the (impossible) case it catches
> it with mbxout_wait it just spits out this warning message and does not do
> much about it. However, there is a problem with printk reenabling
> interrupts and opening the possibility for a race condition in this critical
> area of the 1740 driver.
> 
> So my question: Apart from this (warning) message, does your system
> continue to run, or does it lock up instantly (which could be a side
> effect of the printk)?

It appears to run fine - uptime was almost 20 days when I rebooted for a new
kernel config - I've pushed it hard with lots of disk activity during that
time, but as I said in my first post the messages are usually logged when the
system is unattended (idle) - from today, while I was at work:

Jan  7 09:08:22 calvin kernel: aha1740[24]_mbxout wait!
Jan  7 11:48:22 calvin kernel: aha1740[11]_mbxout wait!
Jan  7 13:08:21 calvin kernel: aha1740[16]_mbxout wait!
Jan  7 13:18:23 calvin kernel: aha1740[12]_mbxout wait!
Jan  7 13:28:24 calvin kernel: aha1740[10]_mbxout wait!
Jan  7 14:38:22 calvin kernel: aha1740[30]_mbxout wait!
Jan  7 17:08:21 calvin kernel: aha1740[30]_mbxout wait!
Jan  7 17:28:23 calvin kernel: aha1740[22]_mbxout wait!
Jan  7 17:38:24 calvin kernel: aha1740[24]_mbxout wait!
Jan  7 18:28:21 calvin kernel: aha1740[2]_mbxout wait!
Jan  7 18:58:24 calvin kernel: aha1740[4]_mbxout wait!

> I would suspect that your setup is just too fast and sees the impossible
> (mbxout still busy). I'd assume a tiny patch to make it loop a few (more)
> tiny times until mbxout is no longer busy is all you need.
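
For illustration only, the kind of bounded retry loop suggested above might
look roughly like the sketch below. It is not taken from the aha1740 driver:
struct fake_mbox, mbox_busy(), wait_mbox_free() and post_address() are
hypothetical names, and the "adapter" is a userspace stand-in that simply
reports busy for a fixed number of polls. The byte order used for the four
address bytes is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the adapter, NOT the real 1740 registers. */
struct fake_mbox {
    int busy_polls_left;  /* number of polls that will still report busy */
    uint8_t addr[4];      /* the four address bytes, written one by one */
};

/* Simulated status read: busy until the counter runs out. */
static int mbox_busy(struct fake_mbox *m)
{
    if (m->busy_polls_left > 0) {
        m->busy_polls_left--;
        return 1;
    }
    return 0;
}

/* Poll at most max_tries times. Returns 0 once the mailbox is free,
 * -1 if it was still busy on every poll. */
static int wait_mbox_free(struct fake_mbox *m, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        if (!mbox_busy(m))
            return 0;
    }
    return -1;
}

/* Wait for the mailbox, then hand over the 32-bit descriptor address
 * byte by byte (low byte first; the ordering is an assumption here).
 * Nothing is written if the mailbox never becomes free. */
static int post_address(struct fake_mbox *m, uint32_t addr, int max_tries)
{
    if (wait_mbox_free(m, max_tries) != 0)
        return -1;
    for (int i = 0; i < 4; i++)
        m->addr[i] = (uint8_t)(addr >> (8 * i));
    return 0;
}
```

Retrying a bounded number of times, rather than warning on the first busy
read or spinning forever, keeps a stuck adapter from hanging the caller. In a
real driver the wait and the four writes would presumably also have to stay
inside one critical section, since a partially written address is what caused
the deadlock described earlier in the thread; this single-threaded sketch
does not model that.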

Yes, I'm running with the processor and L2 cache disabled just now - no 
messages.
 
> What's your kernel version? I could try getting the same (I'm a bit out
> of sync with current ALPHA versions) and providing another tiny patch to the
> 174x driver.

Stock 2.0.27 (RedHat devel rpm). A patch would be greatly appreciated.

FYI, from dmesg:

Configuring Adaptec at IO:1c80, IRQ 11
EATA0: address 0x1c88 in use, skipping probe.
EATA0: rev. 2.0B, EISA, PORT 0x3c88, IRQ 15, DMA 255, SG 64, Mbox 64, CmdLun 2.
EATA0: SCSI channel 0 enabled, host target ID 7.
EATA1: address 0x330 in use, skipping probe.
EATA/DMA 2.0x: Copyright (C) 1994, 1995, 1996 Dario Ballabio.
scsi0 : Adaptec 174x (EISA)
scsi1 : EATA/DMA 2.0x rev. 2.30.00 
scsi : 2 hosts.
  Vendor: DEC       Model: DSP3160S          Rev: T427
  Type:   Direct-Access                      ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
  Vendor: CONNER    Model: CFP2107S  2.14GB  Rev: 2B4B
  Type:   Direct-Access                      ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 2, lun 0
  Vendor: HP        Model: HP35470A          Rev: 1009
  Type:   Sequential-Access                  ANSI SCSI revision: 02
Detected scsi tape st0 at scsi0, channel 0, id 4, lun 0
  Vendor: TOSHIBA   Model: CD-ROM XM-3401TA  Rev: 0283
  Type:   CD-ROM                             ANSI SCSI revision: 02
Detected scsi CD-ROM sr0 at scsi0, channel 0, id 6, lun 0
  Vendor: DEC       Model: DSP3210S          Rev: X441
  Type:   Direct-Access                      ANSI SCSI revision: 02
Detected scsi disk sdc at scsi1, channel 0, id 1, lun 0
  Vendor: HP        Model: HP35470A          Rev: 7 09
  Type:   Sequential-Access                  ANSI SCSI revision: 02
Detected scsi tape st1 at scsi1, channel 0, id 3, lun 0
  Vendor: IMS       Model: CDD521/10         Rev: 2.04
  Type:   WORM                               ANSI SCSI revision: 01
Detected scsi CD-ROM sr1 at scsi1, channel 0, id 4, lun 0
  Vendor: SONY      Model: CD-ROM CDU-8012   Rev: 3.1a
  Type:   CD-ROM                             ANSI SCSI revision: 02
Detected scsi CD-ROM sr2 at scsi1, channel 0, id 5, lun 0
scsi : detected 2 SCSI tapes 3 SCSI cdroms 3 SCSI disks total.
 
Thanks,
	Barrie
-- 
Barrie Spence			Sanity Clause? There is no Sanity Clause
Home: barrie@calvin.demon.co.uk		Telephone +44 1506 442304
Play: barrie@sqf.hp.com			Telephone +44 131 331 7103
