[46479] in Hotline Meeting
Re: Case 132270: 2-032 cluster
daemon@ATHENA.MIT.EDU (Mike Whitson)
Mon Feb 1 15:25:30 1999
To: jjmorey@MIT.EDU
Cc: ops@MIT.EDU, hotline@MIT.EDU
From: Mike Whitson <mwhitson@MIT.EDU>
Date: 01 Feb 1999 15:25:27 -0500
In-Reply-To: hotline@MIT.EDU's message of Mon, 01 Feb 1999 08:48:33 EST
I have to apologize for not getting back to you when you paged about
this; I made a mistake when I took ASO pager duty and misconfigured
the vmail on x3-2624, so I was never paged when you left vmail.
If something like this happens again, the following information is
from the file /mit/ops/doc/Pager.Info, which describes procedures for
paging Athena Ops:
> Escalation:
> How to escalate a problem when (for one reason or another) the person
> on call does not respond.
> After the initial page, wait at least 30 minutes and page them
> again. If you left voicemail to page the person on call, then you
> should (if you are able) look up who it is from
> /mit/ops/doc/Pager.Schedule and page them using 'beep' in the
> net-tools locker. This way if there is a voicemail problem they
> will be able to get the page.
> If there is no response to the second page, you should again wait at
> least 30 minutes and then page the whole Athena Server Operations
> group. This is accomplished with the 'beep ops-group'
> command. You should put the word 'ESCALATE' as the first word so
> that everyone knows that there is a problem that the person on call
> has not responded to.
> For example, an escalation to the rest of the group might look like this:
> % add net-tools
> % beep ops-group
> Type your message now. End with control-D or a dot on a line by itself.
> (limitation of 60 characters enforced by metromedia)
> > ESCALATE file server ixion is down pls call 3-4435
> paging...
> ops-group: paged.
> Keep in mind that 'beep ops-group' pages everyone and should not be done
> lightly.
> In summary, in the event of an emergency while no one from Athena
> Server Operations is around:
> - Page the person on duty (via urgent vmail to x3-2624, or 'beep')
> - Page again, if no answer after 30 minutes (preferably using 'beep')
> - Page ops-group, if still no answer after another 30 minutes
In particular, I would still have received the page had it gone out
using 'beep' rather than through the voicemail system.
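As a rough illustration of the timeline in Pager.Info, here is a small
Python sketch that works out when each page may go out and to whom. It
is not anything Ops runs, and it does not send pages ('beep' is still
how you do that). The names escalation_plan and Step and the username
"oncall" are made up for the example; the 30-minute waits, the ESCALATE
prefix, and the 60-character limit come from the text quoted above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

WAIT = timedelta(minutes=30)  # "wait at least 30 minutes" between pages
MAX_MESSAGE_LEN = 60          # limit shown in the quoted beep prompt


@dataclass
class Step:
    not_before: datetime  # earliest time this page should be sent
    target: str           # who gets paged
    message: str          # text of the page


def escalation_plan(first_page: datetime, on_call: str, message: str) -> List[Step]:
    """Return the three pages described in Pager.Info, in order.

    on_call is a hypothetical username looked up by hand from
    /mit/ops/doc/Pager.Schedule; this sketch does not read that file.
    """
    escalated = "ESCALATE " + message  # ESCALATE must be the first word
    if len(escalated) > MAX_MESSAGE_LEN:
        raise ValueError("message too long once ESCALATE is prepended")
    return [
        Step(first_page, on_call, message),                       # initial page
        Step(first_page + WAIT, on_call, message),                # page again
        Step(first_page + 2 * WAIT, "ops-group", escalated),      # whole group
    ]


if __name__ == "__main__":
    start = datetime(1999, 2, 1, 8, 0)
    text = "file server ixion is down pls call 3-4435"
    for step in escalation_plan(start, "oncall", text):
        print(step.not_before.strftime("%H:%M"), step.target, "|", step.message)
```

The times are "not before" times, since the procedure says to wait at
least 30 minutes; waiting longer before paging ops-group is fine.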
I'm told that it turned out to be a network problem and was fixed, so
at least all's well that ends well. Again, I'm very sorry I didn't
get back to you in a timely fashion.
Mike Whitson
MIT/IS Athena Server Operations