[18573] in Hotline Meeting
RE:AFS Yesterday
daemon@ATHENA.MIT.EDU (jjmorey@MIT.EDU)
Tue Sep 28 07:10:22 1993
From: jjmorey@MIT.EDU
To: hotline@MIT.EDU
Date: Tue, 28 Sep 93 07:10:04 EDT
------- Forwarded Message
Received: from MIT.MIT.EDU by po7.MIT.EDU (5.61/4.7) id AA03290; Mon, 27 Sep 93 18:48:09 EDT
Received: from BIG-SCREW.MIT.EDU by MIT.EDU with SMTP
id AA20488; Mon, 27 Sep 93 18:47:16 EDT
Received: by big-screw
id AA02375; Mon, 27 Sep 93 18:47:02 -0400
Date: Mon, 27 Sep 93 18:47:02 -0400
Message-Id: <9309272247.AA02375@big-screw>
From: Jeffrey I. Schiller <jis@MIT.EDU>
Sender: jis@MIT.EDU
To: dcns@MIT.EDU, acs@MIT.EDU, css@MIT.EDU, acmg@MIT.EDU
Cc: cfyi@MIT.EDU
Subject: Yet another bad day at Athena
We suffered a disk drive failure today on Maeander.MIT.EDU. Maeander is
one of our three AFS "database" server machines and also hosts a copy of
the upper levels of the AFS file tree (i.e., /afs/athena and all the
data needed to find /afs/athena/user/a/a through /afs/athena/user/z/z).
Apparently, with three database servers and three copies of the file
tree, we are operating at about capacity. When we lost Maeander, the
remaining two servers overloaded, resulting in the poor performance that
we experienced today. An AFS software problem on ORF (another of the
database servers) also contributed to our problems.
As of this writing all three database servers are back in operation and
ORF's software problem is resolved. Things are returning to normal.
Over the next few days the members of the Distributed Systems Support
group will be installing additional database and AFS file tree servers.
By tomorrow (Tuesday) we should have four of each. Hopefully this will
improve overall system performance. We also continue to pursue software
improvements and reconfigurations that may improve performance.
Thank you for persevering with us while we endeavor to resolve these
problems.
-Jeff
------- End of Forwarded Message