[1452] in SIPB-AFS-requests

Re: server restarts

daemon@ATHENA.MIT.EDU (mhpower@MIT.EDU)
Mon Jul 18 06:03:56 1994

From: mhpower@MIT.EDU
To: sipb-afsreq@MIT.EDU
In-Reply-To: "[1274] in SIPB-AFS-requests"
Date: Mon, 18 Jul 94 06:03:40 EDT

>What would be useful, though, is if we had a process running on each
>server that frequently (once a minute?) statted the FileLog and
>VolserLog, read the bottom if the mtime changed, pattern-matched any
>new lines against some set of don't-care strings (e.g., RCallBack
>failed) and sent the others out via syslog ...

I finally found some time to work on this (the quoted mail was from
four months ago, in case you don't remember it immediately). The code
I currently have reads FileLog and sends each line it finds there to
syslog. Rather than "read the bottom if the mtime changed", I just keep
the file open, read all the lines that are available at the time, wait
out some delay interval (currently 1 minute), and then try to read
additional lines starting from that offset (i.e., as in the "seek"
example in the perl book). I also check the inode number every minute
and re-open FileLog if that changes (e.g., on fileserver restarts).
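
The polling approach described above can be sketched roughly as follows.
This is an illustration in Python, not the perl code mentioned in this
message; the LogFollower class, the run() function, and the "filelog"
syslog ident are hypothetical names chosen for the example.

```python
import os
import time


class LogFollower:
    """Hypothetical helper: keep a log open and return newly appended lines."""

    def __init__(self, path):
        self.path = path
        self.fh = None
        self.inode = None
        self.pending = b""  # trailing partial line, held until its newline arrives

    def _open(self):
        self.fh = open(self.path, "rb")
        self.inode = os.fstat(self.fh.fileno()).st_ino
        self.pending = b""

    def _drain(self):
        # Read whatever is available past the current offset.
        data = self.pending + self.fh.read()
        parts = data.split(b"\n")
        self.pending = parts.pop()  # empty if the data ended with a newline
        return [p.decode("ascii", errors="replace") for p in parts]

    def poll(self):
        """Return the complete lines added since the previous call."""
        if self.fh is None:
            try:
                self._open()
            except FileNotFoundError:
                return []
        lines = self._drain()
        try:
            current = os.stat(self.path).st_ino
        except FileNotFoundError:
            return lines  # log is briefly missing; try again next time
        if current != self.inode:
            # The log was replaced (e.g., on a fileserver restart): re-open it.
            self.fh.close()
            self._open()
            lines += self._drain()
        return lines


def run(path, interval=60):
    """Poll forever, sending each new line to syslog (Unix only)."""
    import syslog
    syslog.openlog("filelog", 0, syslog.LOG_LOCAL2)
    follower = LogFollower(path)
    while True:
        for line in follower.poll():
            syslog.syslog(syslog.LOG_INFO, line)
        time.sleep(interval)
```

Keeping the file handle open means the read offset is remembered between
polls, so no mtime comparison is needed; the inode check covers only the
case where a new file appears under the same name.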

There are also some translations done if the FileLog line matches
certain patterns. In particular, for all of the common occurrences of
IP addresses (i.e., packets dropped or callbacks failed), the address
is resolved to a hostname if possible, or else converted to the
standard 18.x.y.z format (I was never a big fan of the backwards hex
strings). Also, the volume name can be looked up for every occurrence
of "Cannot read volume header". I made that one optional (you need to
set "lookupvol"), since it's really, really slow.
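
As an illustration of the address translation, here is a minimal Python
sketch. It assumes the "backwards hex" form is the four address bytes
printed in reverse order as up to eight hex digits; that format, and the
function names hex_to_dotted and describe_host, are assumptions made for
the example rather than details taken from the actual code.

```python
import socket


def hex_to_dotted(hexaddr):
    """Convert a byte-reversed hex address to dotted-quad notation."""
    raw = bytes.fromhex(hexaddr.zfill(8))  # restore any dropped leading zeros
    return ".".join(str(b) for b in reversed(raw))


def describe_host(hexaddr, resolver=socket.gethostbyaddr):
    """Return the hostname if the address resolves, else the dotted quad."""
    dotted = hex_to_dotted(hexaddr)
    try:
        return resolver(dotted)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return dotted


print(hex_to_dotted("2a004712"))  # prints 18.71.0.42
```

The resolver argument exists only so the lookup can be replaced in tests;
by default the standard reverse lookup is used.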

I played around with this on ronald-ann for a while (the code is
currently in /var/local/lib/perl). I think it works ok, although I'll
admit I ran "su daemon" every time before using it, and also gave it a
test copy of the FileLog, rather than giving it read access to
/usr/afs/logs. Currently, nsyslog.conf is set up for it to log to a
local file, although I did test syslogging via Zephyr and to charon,
and both of those worked as well.

Is anyone interested in looking over this, or maybe checking the code
against the log files so that less common log entries can be logged at
higher priority (e.g., local2.warning rather than local2.info)? Presumably
other files such as VolserLog should be processed as well.
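
One way the priority assignment could be organized is a table of patterns
checked in order, with anything unrecognized falling through to the
higher priority. The Python sketch below is only an illustration: the
RULES table and priority_for function are hypothetical, and the priority
given to each pattern is a guess, not the result of checking real logs.

```python
import re

# (pattern, priority) pairs, checked in order. The priorities here are
# assumptions for the example.
RULES = [
    (re.compile(r"RCallBack failed"), "info"),
    (re.compile(r"Cannot read volume header"), "warning"),
]


def priority_for(line, default="warning"):
    """Return the priority of the first matching rule, else the default."""
    for pattern, priority in RULES:
        if pattern.search(line):
            return priority
    return default
```

Defaulting to the higher priority means a log entry nobody anticipated is
surfaced rather than buried among the common ones.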

Matt
