[17874] in Athena Bugs


Re: Solaris clients with filesystem corruption

daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Tue May 30 18:59:21 2000

Date: Tue, 30 May 2000 18:59:17 -0400 (EDT)
Message-Id: <200005302259.SAA20792@speaker-for-the-dead.mit.edu>
To: Greg Hudson <ghudson@mit.edu>
CC: bugs@mit.edu
In-reply-to: "[17399] in Athena Bugs"
From: Jonathon Weiss <jweiss@MIT.EDU>


We seem to have fixed (or worked around) the first problem, and have
not seen the second one again.  I'm closing the report.

     Jonathon



   [17399]  daemon@ATHENA.MIT.EDU (Greg Hudson) Athena Bugs 12/01/99 21:41 (29 lines)
   Date: Wed, 1 Dec 1999 21:41:14 -0500 (EST)
   From: Greg Hudson <ghudson@MIT.EDU>

   Today I looked at a couple of machines in the field which had
   experienced corruption on the root filesystem, probably after an
   unclean shutdown of some kind.

   The first machine was an Ultra 5.  It had a four-letter name I don't
   recall right now.  At boot time, it got stuck on "retrying host
   configuration", which is a message from /etc/init.d/rootusr.  The
   cause of the problem was that /etc/hostname.hme0 had been truncated to
   zero length.  There was a pile of fsck errors on the / and /var
   partitions (I would guess also on the /usr partition, but I didn't try
   that).  Mounting the root partition read-write manually and running
   syncconf caused the machine to come up okay.

   The second machine was hudson, a Sparc 5.  It was booting fine but
   xlogin was complaining "workstation failed to activate successfully."
   The machine wasn't getting cluster information.  I found two
   instances of corruption: /etc/athena/version had a bunch of NUL bytes
   appended to it, and /etc/named.conf had been truncated to zero length.

   Lou claims that the first problem is happening very frequently in the
   field since the last patch release.  I don't know what the patch
   release could have to do with it, really.  I have no idea what is
   causing these (presumed) unclean shutdowns, or why the local
   filesystems are experiencing corruption as a result.
   --[17399]-- (nref = [17400])
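
The two symptoms described in the quoted report (configuration files
truncated to zero length, and a file with NUL bytes appended) could be
checked for with a short script.  What follows is only a sketch, not
anything that was run on the affected machines: the function names
check_file and scan are made up for illustration, and the list of paths
is simply the three files named in the report.

```python
import os


def check_file(path):
    """Return a list of symptom strings for one file (empty if none found)."""
    try:
        size = os.path.getsize(path)
    except OSError as exc:
        # Missing or unreadable files are reported rather than raised.
        return ["unreadable: %s" % exc.strerror]
    if size == 0:
        return ["zero length"]
    with open(path, "rb") as f:
        data = f.read()
    problems = []
    stripped = data.rstrip(b"\0")
    if len(stripped) != len(data):
        problems.append("%d trailing NUL bytes" % (len(data) - len(stripped)))
    return problems


def scan(paths):
    """Map each path that shows a symptom to its list of symptoms."""
    report = {}
    for path in paths:
        problems = check_file(path)
        if problems:
            report[path] = problems
    return report


if __name__ == "__main__":
    # Assumption: these are just the files mentioned in the report above.
    suspects = ["/etc/hostname.hme0", "/etc/athena/version", "/etc/named.conf"]
    for path, problems in scan(suspects).items():
        print("%s: %s" % (path, ", ".join(problems)))
```

A check like this only detects the damage after the fact; it says
nothing about what caused the unclean shutdowns.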
