[3199] in testers
sun4 [8.1.2]: zhm
daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Wed Jun 4 19:25:38 1997
Date: Wed, 4 Jun 1997 19:25:14 -0400 (EDT)
To: testers@MIT.EDU
Cc: jweiss@MIT.EDU, cat@MIT.EDU
From: Jonathon Weiss <jweiss@MIT.EDU>
System name: the-other-woman.MIT.EDU
Type and version: SPARC/Classic 8.1.2
Display type: cgthree
What were you trying to do?
send a zephyrgram
What went wrong?
zhm was spinning
trying to send a zgram hung
zstat -h tow timed out
but receiving zgrams worked fine
What should have happened?
I should have failed to lose
Yo, got any documentation, or other info?
Yeah. Shortly before I noticed lossage I got a "hm: looking for new
server" syslog. I'm guessing that this is because the server I'd been
talking to was braindumping to erato, which cat was playing with.
(Note that I am *not* running with a named hack to talk to erato.)
I kill -STOP'd my zhm, and gcored it:
(gdb) where
#0 0xef639204 in _time ()
#1 0x14d78 in resend_notices ()
#2 0x13f98 in handle_timeout ()
#3 0x13300 in main ()
I'll keep the coredump around in case you want to look at it.
At the same time that I lost, cat also lost. cat noted that kill -9
was needed to kill the zhm. At my request she was also able to repeat
the lossage by unplugging sakhmet from the net for some time and then
plugging it back in.
Interestingly, I have kill -9'd my zhm and am still receiving zgrams;
I didn't realize it was unnecessary for incoming zgrams. Learn
something new every day. (Unsurprisingly, restarting zhm seemed to
dump my subs.)