[17874] in Athena Bugs
Re: Solaris clients with filesystem corruption
daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Tue May 30 18:59:21 2000
Date: Tue, 30 May 2000 18:59:17 -0400 (EDT)
Message-Id: <200005302259.SAA20792@speaker-for-the-dead.mit.edu>
To: Greg Hudson <ghudson@mit.edu>
CC: bugs@mit.edu
In-reply-to: "[17399] in Athena Bugs"
From: Jonathon Weiss <jweiss@MIT.EDU>
We seem to have fixed (or worked around) the first problem, and have
not seen the second one again. I'm closing the report.
Jonathon
[17399] daemon@ATHENA.MIT.EDU (Greg Hudson) Athena Bugs 12/01/99 21:41 (29 lines)
Date: Wed, 1 Dec 1999 21:41:14 -0500 (EST)
From: Greg Hudson <ghudson@MIT.EDU>
Today I looked at a couple of machines in the field which had
experienced corruption on the root filesystem, probably after an
unclean shutdown of some kind.
The first machine was an Ultra 5. It had a four-letter name I don't
recall right now. At boot time, it got stuck on "retrying host
configuration", which is a message from /etc/init.d/rootusr. The
cause of the problem was that /etc/hostname.hme0 had been truncated to
zero length. There was a pile of fsck errors on the / and /var
partitions (I would guess also on the /usr partition, but I didn't try
that). Mounting the root partition read-write manually and running
syncconf caused the machine to come up okay.
The second machine was hudson, a Sparc 5. It was booting fine but
xlogin was complaining "workstation failed to activate successfully."
The machine wasn't getting cluster information. I found two
instances of corruption: /etc/athena/version had a bunch of NUL bytes
appended to it, and /etc/named.conf had been truncated to zero length.
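
Both machines showed the same two symptoms: a configuration file
truncated to zero length, or a file with NUL bytes in it. As a rough
sketch only (not an existing Athena tool), a check along the following
lines could flag either symptom. The function name check_file and the
choice of files to scan are assumptions for illustration; the three
paths are simply the ones mentioned in this report.

```python
import os

# Files named in this report; any other list is equally valid.
SUSPECT_FILES = [
    "/etc/hostname.hme0",
    "/etc/athena/version",
    "/etc/named.conf",
]

def check_file(path):
    """Return a list of corruption symptoms for path (empty if none)."""
    try:
        size = os.path.getsize(path)
    except OSError as e:
        return ["unreadable: %s" % e.strerror]
    if size == 0:
        # Matches the truncated hostname.hme0 and named.conf cases.
        return ["zero length"]
    with open(path, "rb") as f:
        data = f.read()
    problems = []
    if b"\0" in data:
        # Matches the NUL bytes found in /etc/athena/version.
        problems.append("contains NUL bytes")
    return problems

if __name__ == "__main__":
    for path in SUSPECT_FILES:
        for problem in check_file(path):
            print("%s: %s" % (path, problem))
```

This only detects the symptoms after the fact; it says nothing about
what caused the unclean shutdowns or the corruption itself.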
Lou claims that the first problem is happening very frequently in the
field since the last patch release. I don't know what the patch
release could have to do with it, really. I have no idea what is
causing these (presumed) unclean shutdowns, or why the local
filesystems are experiencing corruption as a result.
--[17399]-- (nref = [17400])