[16292] in Athena Bugs

sun4 8.2.9: syslogd is six feet under?

daemon@ATHENA.MIT.EDU (John Hawkinson)
Thu Sep 3 23:19:45 1998

Date: Thu, 3 Sep 1998 23:19:12 -0400
To: bugs@MIT.EDU
Cc: meeroh@MIT.EDU, sipb-staff@MIT.EDU
From: John Hawkinson <jhawk@MIT.EDU>


I'm kind of disturbed by this.

A large number of 8.2.9 machines in the SIPB office are not running
syslogd. Specifically, mary-kay-commandos, bobbi-harlow, and
x15-cruise-basselope all lack syslogds. bart-savagewood and tess-turbo,
however, are running syslogd.

tess-turbo has been up longer than bobbi-harlow (10 days vs. 6), so it's
not as if there was a "simple" network event that caused all the syslogds
to die.

This behavior has been observed by other people who appear to have
been lame in reporting bugs, so it's somewhat widespread.

On bh the last log was:

Aug 29 18:26:05 bobbi-harlow.mit.edu unix: afs: setting clock back 2 seconds (via 18.185.0.32 in cell athena.mit.edu).
Aug 29 18:26:05 bobbi-harlow.mit.edu last message repeated 1 time

On xcb:

Aug 23 22:34:44 x15-cruise-basselope.mit.edu ttsession[27039]: exiting

On mkc:

Thu Aug 27 03:47:09 1998 mary-kay-commandos.mit.edu newsyslog[5291]: logfile turned over


A particularly interesting fact is that none of these machines have
syslogd.pid files. This suggests that something is causing
syslogd to exit cleanly and remove its pidfile. Either that, or
something else is removing them. I'm not sure that either of these
makes a lot of sense.
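
For anyone checking other machines, the three cases (no pidfile, pidfile
naming a dead process, pidfile naming a live process) can be told apart
with something like the sketch below. The function name pidfile_state is
made up for illustration, and the pidfile path is left as an argument
because it is not spelled out above.

```shell
# Classify a daemon's pidfile state. The pidfile path is supplied by the
# caller; nothing here assumes where syslogd keeps its pidfile.
pidfile_state() {
    pidfile=$1
    if [ ! -f "$pidfile" ]; then
        echo missing
        return
    fi
    pid=$(cat "$pidfile")
    # kill -0 delivers no signal; it only tests whether the process can be
    # signalled. Run as root, otherwise a live process owned by another
    # user is also reported as stale.
    if kill -0 "$pid" 2>/dev/null; then
        echo running
    else
        echo stale
    fi
}
```

A result of "missing" on a machine with no syslogd matches what is
described above; "stale" would instead point at a crash that left the
pidfile behind.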

I've restarted syslogd with:

[x15-cruise-basselope!jhawk] /# (syslogd -d >& /syslogd.debug &)

Perhaps this will give some insight.

Checking the cluster:

[bobbi-harlow!jhawk] ~> awk -F: '/W20/ && /SUN/{printf "nc -zuv -w1 %s 514&\n", $2}' /afs/net/admin/hosts/hstath.txt > /tmp/p0

[bobbi-harlow!jhawk] ~> sh -x /tmp/p0 > /tmp/p1
[bobbi-harlow!jhawk] ~> grep -c refused /tmp/p1
55

So, that is, 55 suns in W20 refused a probe on UDP port 514 (the syslog
port), which means they have dead syslogds.
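
To repeat the check later, the probe generation can be wrapped up roughly
as follows. This is only a sketch: gen_probes is a made-up name, the awk
filter and field number are the ones used above, and the trailing "wait"
is an addition so the generated script does not exit while background
probes are still running.

```shell
# Emit one background UDP probe per matching host line, then a final
# `wait` so the generated script blocks until every probe has finished.
# The host name is taken from the second colon-separated field, as in
# the one-liner above.
gen_probes() {
    awk -F: '/W20/ && /SUN/ { printf "nc -zuv -w1 %s 514 &\n", $2 }
             END { print "wait" }' "$1"
}
```

Under sh this would be used as "gen_probes hstath.txt > /tmp/p0", then
"sh /tmp/p0 > /tmp/p1 2>&1" and "grep -c refused /tmp/p1". The 2>&1 is
there on the assumption that the local nc prints its -v diagnostics on
stderr, which many builds do.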

This sucks.
I lose two points for failing to be proactive and notice this.
Some other people should lose more points, though.

--jhawk
