[7259] in testers

Re: 9.4.9 linux machines with trashed cache partitions

daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Thu Jul 14 09:01:56 2005

Message-Id: <200507141301.j6ED1i6h002881@distraction.mit.edu>
From: Jonathon Weiss <jweiss@MIT.EDU>
To: Kevin Chen <kchen@MIT.EDU>
cc: Jonathon Weiss <jweiss@MIT.EDU>, testers@MIT.EDU
In-reply-to: Your message of "Thu, 14 Jul 2005 08:33:00 EDT."
             <Pine.LNX.4.62L.0507140832001.19175@scyther.mit.edu> 
Date: Thu, 14 Jul 2005 09:01:44 -0400

> On Thu, 14 Jul 2005, Jonathon Weiss wrote:
> 
> > Something rebooted a bunch of the w20 cluster early-test linux
> > machines shortly after noon yesterday (they were still running 9.4.9,
> > mail on that soon).  Almost all of them failed to come back up on
> > their own.  The common failure was a failed fsck on the afs cache
> > partition, because files contain "Illegal blocks".  I mkfs'd the cache
> > partition, since that was even easier than fsck -y and they all came
> > up fine.  I'm a little concerned about the underlying cause, and we
> > should probably do a little testing on something that has updated to
> > 9.4.10.
> 
> I needed to do this every time I rebooted as a result of a kernel oops. 
> Whether that's not supposed to happen, I don't know, of course.

Oh, it's fairly clear this isn't *supposed* to happen.  The real
question is whether it is another symptom of the same problem as the
OOPSes or, failing that, another bug that would be eliminated by
backing out the newest AFS client.

	Jonathon
