[31826] in Hotline Meeting

SIPB AFS cell stability problems

daemon@ATHENA.MIT.EDU (ghudson@MIT.EDU)
Tue Jan 30 01:47:26 1996

From: ghudson@MIT.EDU
Date: Tue, 30 Jan 96 01:46:36 -0500
To: athena-outage@MIT.EDU
Cc: sipb-afsreq@MIT.EDU

SIPB is encountering serious stability problems with the new AFS
server software we just installed, despite running the same software
as the Athena cell does.  The effects of this instability will be:

	* The cell may fail at any point during the next 24 hours.
	  This may cause file accesses on workstations to hang
	  indefinitely (even for accesses to the Athena cell, due to
	  design failures in Unix and the AFS client code) until the
	  problem is repaired (which generally takes 10-20 minutes
	  after the problem is brought to the attention of an AFS
	  maintainer).

	* We may schedule an outage tomorrow night to revert the cell
	  back to the old AFS software.  If the problem persists and
	  is serious, we may do an emergency cutover back to 3.2
	  during the day.  Such an outage might last as long as half
	  an hour, but would probably not cause workstations to hang.

We are continuing in our current state for another day because we are
guessing that rebooting the problematic servers may alleviate the
problem.  We recognize the negative effects that this instability has
on the Athena environment, and will do our best not to allow the problem
to persist for more than 24 hours if we continue to have failures.

Sorry for the inconvenience.

