[630] in NetBSD-Development
squeamish-ossifrage
daemon@ATHENA.MIT.EDU (Charles M. Hannum)
Tue Mar 7 07:42:28 1995
Date: Tue, 7 Mar 1995 07:41:59 -0500
From: "Charles M. Hannum" <mycroft@ai.mit.edu>
To: netbsd-dev@MIT.EDU
So, yesterday I looked at squeamish-ossifrage. The symptoms of its
problem are:
1) when I boot from a floppy, I can mount and access the SCSI disk
fine,
2) when I boot from the SCSI disk, it fails, saying:
sd0: mode sense (4) returned nonsense; using fictitious geometry
and then doesn't manage to mount root, and
3) when I boot from a floppy and tell it that the SCSI disk is root,
it does the same thing as in 2.
Upon further investigation, I found that indeed the mode sense is
returning garbage. Er, wait, the mode sense isn't returning
*anything*; the buffer is completely unmodified.
Or, more precisely, with the buffer at some addresses (the set of
which I haven't tried to characterize yet), the data actually gets
stored at that address, but with the 1k bit in the address toggled.
This is not because the driver is putting the wrong address in the
CCB; I've checked that quite thoroughly.
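One way to characterize which addresses are affected is to pre-fill both the intended buffer and its "1k alias" (the same address with bit 10, 0x400, flipped) with a sentinel byte before issuing the command, then see which of the two was overwritten. What follows is only a rough userland sketch of that check, not the driver code: the names (ALIAS_BIT, SENTINEL, prefill, classify) are made up for illustration, it works on offsets into an arena assumed to be aligned to at least 2 KB, and it assumes the page size is larger than 1 KB so that bit 10 of the virtual and physical addresses agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIAS_BIT 0x400u   /* the "1k bit" of the address (hypothetical name) */
#define SENTINEL  0xA5     /* arbitrary fill pattern, an assumption */

enum landing { LANDED_NOWHERE, LANDED_INTENDED, LANDED_ALIAS, LANDED_BOTH };

/* Offset of the region the data would hit if bit 10 were toggled. */
static size_t alias_offset(size_t off)
{
    return off ^ ALIAS_BIT;
}

/* Fill the intended region and its alias with the sentinel pattern. */
static void prefill(uint8_t *arena, size_t off, size_t len)
{
    memset(arena + off, SENTINEL, len);
    memset(arena + alias_offset(off), SENTINEL, len);
}

/* Nonzero if any byte in the region no longer holds the sentinel. */
static int modified(const uint8_t *p, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (p[i] != SENTINEL)
            return 1;
    return 0;
}

/* After the transfer, report which of the two regions was written. */
static enum landing classify(const uint8_t *arena, size_t off, size_t len)
{
    int here, there;

    /* The two regions are 1 KB apart, so they overlap if len > 1 KB. */
    assert(len <= ALIAS_BIT);

    here = modified(arena + off, len);
    there = modified(arena + alias_offset(off), len);

    if (here && there)
        return LANDED_BOTH;
    if (here)
        return LANDED_INTENDED;
    if (there)
        return LANDED_ALIAS;
    return LANDED_NOWHERE;
}
```

Calling prefill() before the mode sense and classify() after it, over a range of buffer offsets, would give a map of which addresses misbehave. The check can be fooled if the returned data happens to consist entirely of the sentinel byte, so running it twice with different patterns is safer.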
Now, when you mountroot(), you get a different buffer address than if
you mount() later. The former was experiencing this error, but the
latter was not.
There are a few things that might cause such a problem:
1) loose contacts,
2) bad RAM or cache chip causing the CCB bits to be corrupted,
3) flaky storage somewhere on the board,
4) the phase of the moon is wrong, or
5) broken PCI chipset.
Of course, these might also affect Linux, but since Linux undoubtedly
has different usage patterns, it's hard to say. It could just be
`getting lucky'.
At this point, I'm pretty convinced something weird is happening at a
hardware level, but not being a hardware geek, I don't have a clue
what it is. There are at least a couple of things I can think of that
would induce such a symptom through software, but not quite in the way
this seems to be occurring, and they assume specific designs of the
SCSI board.
So, does anyone have a suggestion on how to track this down? I'd
rather not send in the logic probe.