[31826] in Hotline Meeting
SIPB AFS cell stability problems
daemon@ATHENA.MIT.EDU (ghudson@MIT.EDU)
Tue Jan 30 01:47:26 1996
From: ghudson@MIT.EDU
Date: Tue, 30 Jan 96 01:46:36 -0500
To: athena-outage@MIT.EDU
Cc: sipb-afsreq@MIT.EDU
SIPB is encountering serious stability problems with the new AFS
server software we just installed, despite running the same software
as the Athena cell does. The effects of this instability will be:
* The cell may fail at any point during the next 24 hours.
This may cause file accesses on workstations to hang
indefinitely (even for accesses to the Athena cell, due to
design failures in Unix and the AFS client code) until the
problem is repaired (which generally takes 10-20 minutes
after the problem is brought to the attention of an AFS
maintainer).
* We may schedule an outage tomorrow night to revert the cell
to the old AFS software. If the problem persists and
is serious, we may do an emergency cutover back to 3.2
during the day. Such an outage might last as long as half
an hour, but would probably not cause workstations to hang.
We are continuing in our current state for another day because we are
guessing that rebooting the problematic servers may alleviate the
problem. We recognize the negative effects that this instability has
on the Athena environment, and will do our best not to allow the problem
to persist for more than 24 hours if we continue to have failures.
Sorry for the inconvenience.