[1380] in SIPB-AFS-requests
Re: VRC16 monitors
daemon@ATHENA.MIT.EDU (mhpower@MIT.EDU)
Wed May 18 20:43:03 1994
From: mhpower@MIT.EDU
To: jweiss@MIT.EDU
Cc: yandros@MIT.EDU, sipb-afsreq@MIT.EDU, rtfm-maintainers@MIT.EDU,
    charon-maintainers@MIT.EDU, usenet@MIT.EDU,
    anxiety-maintainers@MIT.EDU
In-Reply-To: "[1372] in SIPB-AFS-requests",
    "[1553] in RTFM_Maintainers_Archive",
    "[0794] in Charon_Maintainers_Archive",
    "[28763] in Usenet_Meeting"
Date: Wed, 18 May 94 20:42:01 EDT
> ... Matt, were your concerns addressed
>sufficiently?
Well, yandros wrote:
> ... If there's a problem with a
>machine, then someone who has gone to all the trouble to go into the
>machine room can, I hope, be bothered to plug a monitor into a machine
>before rebooting it.
Ok, here's the situation I was thinking of. A server machine crashes
and begins to automatically reboot (happens around once a month...).
As part of the crash, it printed the following Useful Error Message:
ULTRIX kernel developer hbinbet@raqxs1.enet.dec.com is a bonehead
Unfortunately, this message did not get written to disk, nor did it
get sent out over the net. (Obviously, in a lot of cases error messages
will be written to disk or sent over the net, but I've seen cases where
they weren't.)
Soon after, the maintainer of SIPB-server-machine-X gets a noc
notification that SIPB-server-machine-X has not responded in a while.
Maintainer-person then heads into the machine room to see what's up.
Unfortunately, the monitor is connected to SIPB-server-machine-Y,
and by the time the cable (cables? I think there are at least two) is
located and swapped, SIPB-server-machine-X is already displaying some
other text on the monitor, and Useful Error Message is lost forever.
If the monitor had already been connected to SIPB-server-machine-X,
maintainer-person could've read the screen before it was too late.
Now, I realize this isn't the world's most likely scenario, and
although it could happen, I see that other people would rather have
the monitors in the office, Useful Error Messages notwithstanding.
So, I suggest we instead try to reduce the "by the time the cable..."
delay by training SIPB members in swapping monitors as quickly as
possible. We'll start in the SIPB office, and try to minimize the
point values we garner in the following competitive event:

Item                                          Scoring
----                                          -------
elapsed time to swap monitor, per second      +1 point
setting off machine-room alarm                +3000 points
accidentally halting the machine to which     +500 points
  the monitor was previously attached
accidentally halting the machine to which     +250 points
  the monitor was to be moved
accidentally halting other machine(s). Each:  +750 points
tripping over the power cord to prill         +1500 points
answering ez question while on the way into   -50 points
  the machine room
entering competition without knowing the      +5000 points
  machine-room alarm combo
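
For anyone who wants to keep score, the table above can be tallied mechanically. What follows is only a minimal Python sketch: the `Attempt` record, its field names, and the `score` function are all hypothetical, and it assumes elapsed time is counted in whole seconds and that the ez-question bonus applies at most once per attempt. Lower totals are better.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One monitor-swap attempt. All field names are hypothetical."""
    seconds: int                          # elapsed time to swap the monitor (assumed whole seconds)
    set_off_alarm: bool = False           # machine-room alarm went off
    halted_previous: bool = False         # halted the machine the monitor was previously attached to
    halted_target: bool = False           # halted the machine the monitor was to be moved to
    other_machines_halted: int = 0        # number of other machines halted
    tripped_over_prill_cord: bool = False
    answered_ez_question: bool = False    # assumed: at most one bonus per attempt
    knew_alarm_combo: bool = True         # entered knowing the machine-room alarm combo


def score(a: Attempt) -> int:
    """Return the total points for an attempt, following the scoring table."""
    points = a.seconds                    # +1 point per second
    if a.set_off_alarm:
        points += 3000
    if a.halted_previous:
        points += 500
    if a.halted_target:
        points += 250
    points += 750 * a.other_machines_halted   # +750 for each other machine
    if a.tripped_over_prill_cord:
        points += 1500
    if a.answered_ez_question:
        points -= 50
    if not a.knew_alarm_combo:
        points += 5000
    return points


if __name__ == "__main__":
    # A clean 90-second swap with no mishaps.
    print(score(Attempt(seconds=90)))
```

Under these assumptions, a contestant who takes two minutes, sets off the alarm, and never knew the combo would be charged for the time, the alarm, and the missing combo together.
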
>Is there a *real* reason that we need a monitor connected to every
>server in the machine room at all times? Athena System Support
>doesn't seem to think so...
Athena System Support usually doesn't have people sitting around
near their machine rooms. We often have people near W20-575A, so
we have the opportunity to read the occasional error messages that
System Support couldn't have read (i.e., the messages that don't
get logged to disk or sent over the net). Again, possibly this isn't
a big deal.
Matt