[4955] in Athena Bugs

Re: New NOC

daemon@ATHENA.MIT.EDU (tom@MIT.EDU)
Sun May 20 13:24:13 1990

From: tom@MIT.EDU
Date: Sun, 20 May 90 13:23:43 EDT
To: dennis@MIT.EDU
Cc: network@MIT.EDU, bugs@MIT.EDU

    To: network@MIT.EDU
    Subject: New NOC
    From: dbaron@MIT.EDU (Dennis Baron)
    Date: Sun, 20 May 90 11:15:06 EDT
    Sender: dennis@ATHENA.MIT.EDU

    Looked nice but I think it died.  I got:

    Sender: THE NOC! <rcmd.doghouse>
    Time:   10:36:50

    echo service on kerberos.mit.edu has not responded in 13 seconds.
    (2 attempts.)


A very strange thing happened. At approximately 10:58am, doghouse dropped
off the network (no errors). Monitor did crash as well, soon after it sent
out the notice about kerberos. It attempted to send two more notices within
the next 30 minutes about two other events. No one saw the first notice,
and the second notice caused the following crash. The date of the core dump
was about the time doghouse dropped off the network, but well after the
program stopped updating its logs. In other words, it was stuck somewhere,
perhaps in the zephyr library (6.4r).

If this is so, then it makes sense that it took 549 seconds (see below) to
make three polls, since it may have been stuck sending a previous
notification. This machine is configured such that it should take about
15-30 seconds to note the outage.
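
To illustrate the point about a notification call holding up the polling
loop, here is a minimal sketch in Python (not the monitor's actual code and
not the Zephyr API). The names notify_with_deadline, slow_send and fast_send
are hypothetical. It assumes the send can safely be abandoned in a
background thread if it does not return in time.

```python
import threading
import time


def notify_with_deadline(send, deadline_seconds):
    """Run send() in a daemon thread; return True if it finished in time.

    Hypothetical helper: the caller (the polling loop) waits at most
    deadline_seconds instead of blocking for as long as send() takes.
    """
    done = threading.Event()

    def worker():
        try:
            send()
        finally:
            # Signal completion even if send() raised.
            done.set()

    threading.Thread(target=worker, daemon=True).start()
    # Event.wait returns True if the flag was set, False on timeout.
    return done.wait(deadline_seconds)


def slow_send():
    # Stand-in for a notification that hangs (for example, waiting on an
    # unreachable server).
    time.sleep(0.5)


def fast_send():
    # Stand-in for a notification that returns immediately.
    pass
```

With something like this, a hung notification would cost the loop at most
the deadline, so the time between polls stays close to the configured
interval. The abandoned send keeps running in its own thread, which may or
may not be acceptable for a real monitor.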

I'm sending this to bugs because it crashed in a standard library.  

send_to_kdc.send_to_kdc(0x1aae0, 0x1afcc, 0x170bc) at 0x6eb7
get_ad_tkt.get_ad_tkt(0x134b8, 0x134bf, 0x170bc, 0x60) at 0x655e
krb_mk_req(0x7fffd5e0, 0x134b8, 0x134bf, 0x170bc, 0x0) at 0x5ae8
ZMakeAuthentication(0x7fffe040, 0x7fffdb40, 0x320, 0x7fffdb3c) at 0x3b71
Z_FormatHeader(0x7fffe040, 0x7fffdb40, 0x320, 0x7fffdb3c, 0x3b4c) at 0x4c3d
ZFormatNoticeList(0x7fffe040, 0x7fffe104, 0x4, 0x7fffde94, 0x7fffde90, 0x3b4c) at 0x53fd
ZSrvSendList(0x7fffe040, 0x7fffe104, 0x4, 0x3b4c, 0x534e) at 0x4259
ZSendList.ZSendList(0x7fffe040, 0x7fffe104, 0x4, 0x3b4c) at 0x4236
zsend(0x7fffe040, 0x7fffe104, 0x4, 0x1) at 0x183c
zsend_message(0x1285d, 0x22618, 0x22638, 0x1269c, 0x7fffe104, 0x4) at 0x1811
sm_notify.sm_notify(0x22600, 0x2ee50, 0x32e00, 0x7fffe224) at 0x177e
sm_event(event = 0x2265c, status = -102, level = 1, blurb = "timeout", message ="has not responded in 549 seconds.\n(3 attempts)", misc = "The following services affected: \necho.hyperion.mit.edu\n"), line 122 in "sm_events.c"
echo_monitor(0x22600) at 0x1dc4
sm_mainloop() at 0x68a
main.main(0x1, 0x7fffe71c, 0x7fffe724) at 0x25b

