[4283] in Athena Bugs

Disk Server Problems Damage Student Opinion

daemon@ATHENA.MIT.EDU (tldavis@ATHENA.MIT.EDU)
Wed Feb 21 18:26:12 1990

From: tldavis@ATHENA.MIT.EDU
To: bugs@ATHENA.MIT.EDU
Cc: lcomeau@HSTBME.MIT.EDU
Date: Wed, 21 Feb 90 18:24:19 EST

What is happening when everyone starts getting AFS and NFS errors and
then everything suddenly freezes for anywhere from 1 to 20 minutes?
This happens frequently at our cluster (E25-131), and I've seen it happen
elsewhere as well.

I develop Athena-based simulation course software for the Division of
Health Sciences and Technology.  Our students uniformly like the Athena
look-and-feel and the course software, but I have had MANY complaints
and questions like "What does this mean, NFS error..." and "Why is
everything stopped?"

For many, it is enough to kill any enthusiasm for the course software. 

It is EXTREMELY frustrating when a server goes on the blink, mainly
because every process reading or writing to it gets stuck in a DISK
WAIT, from which there is absolutely no recovery for several minutes. 
Why can't csh KILL or at least SUSPEND (^Z) those processes while they
are waiting for the disk server?

I know that once I have a csh going, I can create my new shells with the -f
(fast) option to avoid that deadly search of my home directory.  I guess I'm
going to have to add a "fast csh" to my window manager menu for such
emergency exits.
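
As a rough sketch of such an emergency exit (the use of xterm here is an
assumption, not something confirmed for every Athena workstation), a new
window could be started with a shell that skips reading .cshrc:

```shell
# Sketch only: open a new terminal whose shell is started with -f,
# so csh does not read ~/.cshrc on startup.
# Assumes xterm is installed and an X display is available.
xterm -e csh -f &
```

The same command line could be bound to a window manager menu entry; the
exact menu syntax depends on the window manager in use.  A shell started
this way has none of the usual aliases or path settings, and it will still
block if it touches files on the unresponsive server.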

Basically, I think Athena is great except for this one giant problem, which
is absolutely undocumented, as far as I can tell.  My current answer to student
inquiries is to say "The network or the file server is sick.  Try again in a
few minutes, and if it still doesn't work just come back later."  Is this
an appropriate answer?  Often, students are unable even to log out.  Should
a workstation in this state be rebooted?
