[19837] in Athena Bugs
creating unref dirs that give indigestion to Solaris8 "fsck -o p"
daemon@ATHENA.MIT.EDU (Tom Yu)
Tue Sep 25 05:34:51 2001
To: bugs@mit.edu
From: Tom Yu <tlyu@MIT.EDU>
Date: 25 Sep 2001 05:34:48 -0400
Message-ID: <ldvu1xr8vgn.fsf@saint-elmos-fire.mit.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Below, I will document what I believe to be a rather repeatable
technique for the creation of unreferenced directories under Solaris
8. These unreferenced directories have many of the same properties as
the unreferenced /tmp that tends to prevent the boot-time automatic
fsck ("/usr/sbin/fsck -o p") from succeeding on the root partition of
an Athena 9.0.x Sun that has crashed hard.
[first make sure ufs logging is turned off]
/bin/sh
mkdir /foo
(cd /foo && sleep 9999)&
rm -rf /foo
fsck -n /
[we expect an unref inode due to the cwd of the sleep process]
for i in /proc/*; do (cd "$i/cwd" 2>/dev/null && ls -ldi .); done | grep "$bogus_inode"
[where $bogus_inode is the inode# of the unref dir found by fsck]
kill $pid
[where $pid is $! or whatever the pid of the sleep process is]
sync; sync; sync
fsck -n /
[this will still show an unref dir!]
for i in /proc/*; do (cd "$i/cwd" 2>/dev/null && ls -ldi .); done | grep "$bogus_inode"
[note this will fail to turn up any proc w/cwd = inode# of unref dir]
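[the /proc scan used twice above can be wrapped in a small helper; what
follows is only a sketch, and the function name find_cwd_inode is made
up for illustration. It assumes /proc/<pid>/cwd can be entered with cd,
as in the loop above, and it compares the inode number exactly rather
than grepping for it. Inode numbers are only unique per filesystem, so
a match on some other filesystem is possible.]

```shell
# find_cwd_inode INODE
# Print "PID INODE" for every process whose current working directory
# has the given inode number. Hypothetical helper, not part of any tool.
find_cwd_inode() {
    target=$1
    for d in /proc/[0-9]*; do
        # cd fails for processes that have exited or that we may not
        # inspect; in that case ino is empty and the entry is skipped.
        ino=`(cd "$d/cwd" 2>/dev/null && ls -di . | awk '{print $1}')`
        if [ "$ino" = "$target" ]; then
            echo "`basename $d` $ino"
        fi
    done
}
```

[usage: "find_cwd_inode $bogus_inode" should print the pid of the sleep
process before the kill; if the behavior described above holds, it
should print nothing afterwards]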
--------------------
I will note that examining in-kernel structures with /usr/sbin/crash
reveals a bogus refcount of 1 for the directories in question, while
/usr/sbin/fsdb shows a link count of zero, even though the directory
still holds a "." entry for itself and a ".." entry pointing at the
root. This, of course, will
also result in "fsck -n" noting that the link count for "/" should be
higher than it is on-disk.
I am fairly convinced that this is some subtle bug in the refcount
management for ufs inodes. While it is the case that turning on ufs
logging will prevent the symptoms, I fear that it masks the underlying
problem rather than solves it. This does not give me a warm and fuzzy
feeling about the Solaris 8 ufs implementation.
Anyway, if people need it, I can provide additional data such as
transcripts of more elaborate versions of the procedure above, showing
details of output from fsdb and crash.
---Tom