[526] in linux-scsi channel archive
PLEASE READ!! -- frustrating EISA-bus / SCSI-controller problems
daemon@ATHENA.MIT.EDU (Jarrett Redd)
Sat Aug 19 06:10:26 1995
Date: Sat, 19 Aug 1995 00:19:59 -0500
From: Jarrett Redd <redd@pat.mdc.com>
To: linux-scsi@vger.rutgers.edu
PLEASE!!! Can I bend everyone's ear for 5 minutes about a VERY PECULIAR
situation with a particular DTC3292 controller problem? I am not a lazy,
FAQ-asking person looking for an easy answer. I am an engineer with
McDonnell-Douglas who has been struggling to install Linux on several machines.
I have read EVERY SINGLE PIECE OF FAQ, HOWTO, README, etc. that I could find,
and I have got it running on every machine except for one FRUSTRATING one!!
I have already sent this to Drew Eckhardt at cs.colorado.edu and his responses
are included within this posting as well as my comments. Thanks, Drew!
I can provide you with a detailed explanation of everything I have done, and
it is extensive! But I'll summarize to save you some time (and grief):
The system is: EISA bus 486dx2-66, with a DTC3292 SCSI controller
Two SCSI drives (Quantum 245meg, Sequel 4gig)
The docs say the DTC controller is supported as compatible with an AHA1542 and
it is indeed detected as such. The two SCSI hard-drives are also both detected
correctly. The problem is that in addition, non-existent drives are also
detected at every SCSI ID. Wait!! I know the SCSI-HOWTO says that this is
because a device is strapped at the same address as the controller. But they
are not! I have tried virtually all possible ID configurations, but nothing
will work.
As the boot proceeds, the two REAL drives are detected with correct sector
sizes of 512 bytes, but the FAKE drives are detected with sector size zero.
Fine, so these devices are deleted from the linked list built in scan_scsis().
However, when it comes time to do the partition check, they are still present
as NULL, because the check passes fine on the two real devices and gives a
kernel panic on the first nonexistent one:
kernel panic: no device passed to allocate_device()
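One way the symptom above could arise is sketched below. This is not the kernel's code: the struct, the function names, and the per-ID table are all hypothetical, made up for illustration. The sketch assumes that zero-sector devices are pruned from a linked list, that a later pass walks a fixed table indexed by SCSI ID, and that the later pass must therefore skip NULL slots rather than assume every probed ID still has a device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_IDS 8   /* narrow SCSI bus: IDs 0..7 */

/* Hypothetical stand-in for a detected device; not the kernel's struct. */
struct fake_dev {
    int id;
    unsigned sector_size;      /* 0 means the probe returned nonsense */
    struct fake_dev *next;
};

/* Prepend a device to the list and return the new head. */
struct fake_dev *push_dev(struct fake_dev *head, int id, unsigned sector_size)
{
    struct fake_dev *d = malloc(sizeof *d);
    if (d == NULL)
        abort();
    d->id = id;
    d->sector_size = sector_size;
    d->next = head;
    return d;
}

/* Unlink and free every device whose sector size is zero. */
struct fake_dev *prune_bogus(struct fake_dev *head)
{
    struct fake_dev **link = &head;
    while (*link != NULL) {
        struct fake_dev *cur = *link;
        if (cur->sector_size == 0) {
            *link = cur->next;     /* bypass the bogus entry */
            free(cur);
        } else {
            link = &cur->next;
        }
    }
    return head;
}

/* Fill a per-ID table from the pruned list; pruned IDs stay NULL. */
void build_table(struct fake_dev *head, struct fake_dev *table[MAX_IDS])
{
    for (int i = 0; i < MAX_IDS; ++i)
        table[i] = NULL;
    for (struct fake_dev *d = head; d != NULL; d = d->next) {
        assert(d->id >= 0 && d->id < MAX_IDS);
        table[d->id] = d;
    }
}

/* A later pass (e.g. a partition check) must skip the NULL slots. */
int count_usable(struct fake_dev *table[MAX_IDS])
{
    int n = 0;
    for (int i = 0; i < MAX_IDS; ++i)
        if (table[i] != NULL)
            ++n;
    return n;
}
```

If the later pass handed `table[i]` to a consumer without the NULL check in `count_usable`, the first pruned ID would trigger exactly the kind of "no device passed" failure quoted above.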
So they are still in the list as NULL SCSI pointers? Anyway, in desperation, I
moved my drives to IDs 0 and 1 and the controller to ID 2, hacked the kernel to
reject any IDs greater than 2 (in the for dev=1 to <8 loop), and now it boots up
fine.
However, after the install, when I try to reboot, it has trouble mounting the
root partition and init complains that it cannot locate /bin/libc.so.4. If I
boot up with a floppy and manually mount the root partition, that file is there
right where it should be!! However, if I poke around in the file system, I
inevitably come across invalid directory entries and half-complete files.
Basically, the file system is Swiss cheese!
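The scan-limiting hack mentioned earlier in this paragraph can be sketched as a standalone loop. This is only an illustration of the idea, not the actual scan_scsis() code: `scan_ids`, `probe_fn`, and `always_answers` are hypothetical names, and `always_answers` merely simulates a controller that reports a device at every ID.

```c
#include <assert.h>
#include <stdbool.h>

#define SCSI_IDS 8   /* narrow SCSI bus: IDs 0..7 */

/* Hypothetical probe callback: returns true if a device answers at `id`. */
typedef bool (*probe_fn)(int id);

/* Simulates the faulty behaviour reported here: every ID seems to answer. */
bool always_answers(int id)
{
    (void)id;
    return true;
}

/*
 * Probe IDs 0..max_id, skipping the host adapter's own ID.
 * Returns a bitmask with bit N set when ID N answered.
 */
unsigned scan_ids(probe_fn probe, int host_id, int max_id)
{
    unsigned found = 0;

    assert(probe);
    for (int id = 0; id < SCSI_IDS; ++id) {
        if (id == host_id)
            continue;          /* never probe the controller itself */
        if (id > max_id)
            break;             /* the hack: ignore all higher IDs */
        if (probe(id))
            found |= 1u << id;
    }
    return found;
}
```

With the drives at IDs 0 and 1 and the controller at ID 2, `scan_ids(always_answers, 2, 2)` reports only IDs 0 and 1, which hides the phantom devices without explaining why they appear.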
So is the controller not REALLY talking to the drives quite right?
de> Probably; although it's probably not talking to the system right. Looks
de> like a cache coherency or arbitration problem. If your system is set up
de> with a write-back cache, try write through; disabling the cache may also
de> work.
jr> changed write-back to write-thru in the E-CMOS setup... so far no luck!
The drives format correctly across the entire disk. In fact, they mount and most
files seem fine and will execute. Almost seems like a termination or cabling
problem but the drives work flawlessly under MS-DOS.
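A quick way to look for this kind of silent corruption from user space is to write a known pattern to a file on the suspect disk and compare it on read-back. The sketch below is a minimal, hypothetical test program fragment (the function names and the pattern formula are made up for illustration). It assumes the file is large enough, or the file system is unmounted and remounted between the write and the check, so that reads actually come from the disk rather than from the buffer cache.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Deterministic pattern so any block can be regenerated for comparison. */
static void fill_block(unsigned char *buf, unsigned long block_no)
{
    for (size_t i = 0; i < BLOCK_SIZE; ++i)
        buf[i] = (unsigned char)((block_no * 31u + i * 7u) & 0xffu);
}

/* Write `blocks` pattern blocks to `path`. Returns 0 on success, -1 on error. */
int write_pattern(const char *path, unsigned long blocks)
{
    unsigned char buf[BLOCK_SIZE];
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    for (unsigned long b = 0; b < blocks; ++b) {
        fill_block(buf, b);
        if (fwrite(buf, BLOCK_SIZE, 1, f) != 1) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Re-read `path`; returns the number of mismatched blocks, or -1 on I/O error. */
long count_bad_blocks(const char *path, unsigned long blocks)
{
    unsigned char expect[BLOCK_SIZE], got[BLOCK_SIZE];
    long bad = 0;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    for (unsigned long b = 0; b < blocks; ++b) {
        if (fread(got, BLOCK_SIZE, 1, f) != 1) {
            fclose(f);
            return -1;         /* short file or read error */
        }
        fill_block(expect, b);
        if (memcmp(expect, got, BLOCK_SIZE) != 0)
            ++bad;
    }
    fclose(f);
    return bad;
}
```

A nonzero count from `count_bad_blocks` on a freshly written file would point at the data path (controller, DMA, cache) rather than at the file system code.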
Well, after two weeks of struggling, I am officially out of ideas! Are there
any suggestions that you can send my way? I can give more information if you
need it. Also, can you please PASS THIS ON to anyone who you think can help
me out?
de> You can forward your question and my response to
de> linux-scsi@vger.rutgers.edu and maybe linux-kernel@vger.rutgers.edu (I
de> don't think this is a SCSI problem) if you want; there may be other people
de> attempting to run EISA bus masters in the same make/model of mainboard
de> (although vendors tend to upgrade after chipset bugs are fixed, so they
de> might not have the same system).
jr> main board chip set is HiNT.
Another thing that may or may not be important: The floppy drives are hanging
off the DTC controller since it supports them. Every so often, the system will
lock up when the floppies and hard-drive are both active (such as during the
install). This never happens under MS-DOS either.
de> This suggests an arbitration problem. You may be able to work around it
de> by playing with bus on/off times, etc. (could be set in the Adaptec driver,
de> maybe with jumpers), maybe not. MS-DOS doesn't do data transfers to the
de> floppy drives and hard disks at the same time. Linux does.
jr> the DTC3292 has no jumpers on board... it's all software configured and
jr> there are limited options to choose from. no bus stuff at all.
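To make the bus on/off idea above concrete: a bus-mastering adapter typically holds the bus for a configured "on" time and then releases it for an "off" time, and anything else that needs the bus (such as floppy DMA) has to fit into the gaps. The sketch below is plain arithmetic under that simplified model. The function names are hypothetical, and the values in the usage note are illustrative rather than taken from the DTC3292 or the Adaptec driver.

```c
#include <assert.h>

/*
 * Percentage of each on/off cycle during which the bus master holds the bus.
 * Integer division, so the result is rounded down.
 */
int bus_hold_percent(int on_us, int off_us)
{
    assert(on_us > 0 && off_us > 0);
    return (100 * on_us) / (on_us + off_us);
}

/*
 * In this simplified model another bus user waits at most one full
 * "on" period before the bus master releases the bus.
 */
int worst_case_wait_us(int on_us)
{
    assert(on_us > 0);
    return on_us;
}
```

For example, 11 microseconds on and 4 microseconds off gives `bus_hold_percent(11, 4)` of 73, while 5 on and 9 off gives 35. Shortening the on time trades SCSI throughput for a shorter worst-case wait for the floppy controller.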
HELP?! Again, thanks so much for your time and attention. I know you are
probably incredibly busy. I greatly appreciate your help.
Sincerely,
Jarrett L. Redd
redd@pat.mdc.com
---
Orbital Systems Engineer
McDonnell-Douglas Aerospace
Space Station Division
Houston, Texas