[16301] in Athena Bugs


Re: sun4 8.2.9: syslogd is six feet under?

daemon@ATHENA.MIT.EDU (John Hawkinson)
Sat Sep 5 00:20:01 1998

Date: Sat, 5 Sep 1998 00:19:58 -0400
To: Greg Hudson <ghudson@MIT.EDU>
Cc: bugs@MIT.EDU
In-Reply-To: "[16298] in Athena Bugs"
From: John Hawkinson <jhawk@MIT.EDU>

| It's interesting that the machines have no syslog.pid files (unless
| jhawk is confused and actually did look for /etc/syslogd.pid instead
| of /etc/syslog.pid).  If Dan's race is at fault, then newsyslogd would

Oops, sorry for the error in my mail; I was indeed looking for syslog.pid.

Actually, though, it appears that machines running syslogd are missing the
pid file. For instance:

[x15-cruise-basselope!jhawk] ~> ps -ef | grep syslog
   jhawk 23829 23564  0 00:05:51 pts/0    0:00 grep syslog
    root 28960     1  0   Sep 03 ?        0:00 syslogd -d
[x15-cruise-basselope!jhawk] ~> ls -ld /etc/syslog*
-rw-r--r--   1 root     other        588 Aug 21 14:45 /etc/syslog.conf
[x15-cruise-basselope!jhawk] ~> 
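
For checking this across several machines, the comparison above can be
sketched as a small helper. pidfile_status is a hypothetical name, not
an existing Athena tool; it only compares the contents of a pid file
with a pid you already got from ps:

```shell
# Hypothetical helper: report whether PIDFILE exists and matches PID.
# usage: pidfile_status PIDFILE PID
pidfile_status() {
    pidfile=$1
    pid=$2
    if [ ! -f "$pidfile" ]; then
        echo "missing"          # daemon may be running, but left no pid file
    elif [ "$(cat "$pidfile")" = "$pid" ]; then
        echo "ok"               # pid file agrees with the running process
    else
        echo "stale"            # pid file names some other process
    fi
}
```

On the machine above, "pidfile_status /etc/syslog.pid 28960" would be
expected to print "missing", since ls shows no such file.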

It also seems that receipt of a HUP causes syslogd to restart without
debugging, which is irritating.

Unlike in Bob's case, HUPing separated by a second doesn't seem to cause
x15-cruise-basselope's syslogd to die.

On the other hand, it appears syslogd is running, but not operational
("logger -p local0.info test" didn't show up in /var/adm/messages).
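
A rough sketch of that test as a repeatable probe. The function names
are hypothetical, and it assumes that local0.info is routed to
/var/adm/messages by syslog.conf and that grep accepts -F (on some
systems fgrep would be needed instead):

```shell
# Hypothetical helper: succeed if the fixed string MARKER appears in LOGFILE.
# usage: marker_logged MARKER LOGFILE
marker_logged() {
    grep -F "$1" "$2" >/dev/null 2>&1
}

# Hypothetical wrapper: log a uniquely tagged message, wait briefly,
# then look for it. The default log file is an assumption.
# usage: syslog_probe [LOGFILE]
syslog_probe() {
    logfile=${1:-/var/adm/messages}
    marker="syslog-probe-$$-$(date +%H%M%S)"
    logger -p local0.info "$marker"
    sleep 2                     # give syslogd a moment to write the line
    marker_logged "$marker" "$logfile"
}
```

"syslog_probe && echo alive || echo not-logging" then gives a quick
yes/no answer; a unique marker avoids matching an older "test" line.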

Interestingly, only one LWP seems to exist:

[x15-cruise-basselope!jhawk] ~> ps -Llp 28960
 F S   UID   PID  PPID   LWP  C PRI NI     ADDR     SZ    WCHAN TTY     LTIME CMD
48 S     0 28960     1     5  0  41 20 60f1b020    448    2e63c ?        0:00 syslogd

And truss sez:

[x15-cruise-basselope!jhawk] ~# truss -p 28960
lwp_sema_wait(0x0002E638)       (sleeping...)
^C[x15-cruise-basselope!jhawk] ~#


Not sure what to conclude about the WCHAN, seems weird:

[x15-cruise-basselope!jhawk] /proc/28960/object# adb `which syslogd`
0t28960:A
process 28960 stopped at:
0xef639618:     ta      0x8
2e63c/X
_end+0xa4:      0
2e63c?X
_end+0xa4:      0
?i
_end+0xa4:      unimp   0x0
/i
_end+0xa4:      unimp   0x0
:R
^D


The process doesn't seem to respond to SIGHUP, SIGUSR1, or SIGTERM...

Restarting it yields an /etc/syslog.pid (without -d this time).
HUPing in rapid succession yields a return to the clam-state (and a
reduction from 7 LWPs to 1).
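
For anyone who wants to reproduce this, a minimal sketch follows. Both
function names are hypothetical, and it assumes a ps that supports the
nlwp output field; where it doesn't, counting the lines of "ps -Lp"
output is an alternative:

```shell
# Hypothetical helper: print the number of LWPs in process PID.
# usage: lwp_count PID
lwp_count() {
    ps -o nlwp= -p "$1" | tr -d ' '
}

# Hypothetical helper: send COUNT back-to-back HUPs to PID (default 5).
# usage: hup_burst PID [COUNT]
hup_burst() {
    pid=$1
    count=${2:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        kill -HUP "$pid" || return 1   # stop if the process has gone away
        i=$((i + 1))
    done
}
```

Running lwp_count before and after hup_burst on a freshly started
syslogd should show whether the LWP count drops the way it did here.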

This suggests that there may be multiple problems going on here.

--jhawk
