[1274] in SIPB-AFS-requests

Re: server restarts

daemon@ATHENA.MIT.EDU (mhpower@MIT.EDU)
Sun Mar 20 22:06:48 1994

From: mhpower@MIT.EDU
To: tlyu@MIT.EDU
Cc: sipb-afsreq@MIT.EDU
In-Reply-To: "[1273] in SIPB-AFS-requests"
Date: Sun, 20 Mar 94 22:06:26 EST

>rosebud and ronald-ann currently do not do weekly restarts.

Specifically, rosebud restarts every day at 4am, and ronald-ann
is not scheduled to ever restart.

>                       ... is there any good reason we should or
>should not do a weekly restart?

I guess the current arrangement gives users, for example, a copy of
the sipb locker on a frequently restarted server, and another copy on
a nonrestarting server. Possibly this is a good compromise in that
either restarting or running forever could sometimes lead to problems
with file access. Also, the cell has been running fairly reliably
with this arrangement for several months, and because of that I don't
see a clear motivation for any change to the setup.

What would be useful, though, is if we had a process running on each
server that frequently (once a minute?) statted the FileLog and
VolserLog, read the bottom if the mtime changed, pattern-matched any
new lines against some set of don't-care strings (e.g., RCallBack
failed) and sent the others out via syslog (maybe using higher
priorities for truly unusual entries). Probably we'd want the syslogs
to go into a file, maybe on some other machine, and via zephyr to afs
maintainers. If AIX syslogd doesn't do that, I suppose we could have
messages sent out to a magic zephyr class until rosebud is replaced.
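
A rough sketch of such a watcher, in Python. This is only one possible
shape, not an existing tool. The log directory, the "panic" pattern, and
all function and class names are assumptions made up for illustration;
only the "RCallBack failed" string comes from the description above.
Zephyr delivery and remote syslog forwarding are left out.

```python
import os
import re
import time

# Assumed log locations; adjust to wherever the servers really keep them.
LOG_PATHS = ["/usr/afs/logs/FileLog", "/usr/afs/logs/VolserLog"]

# Lines matching these are dropped without being reported.
DONT_CARE = [re.compile(r"RCallBack failed")]
# Hypothetical pattern for entries that deserve a higher priority.
UNUSUAL = [re.compile(r"panic", re.IGNORECASE)]


def classify(line, dont_care=DONT_CARE, unusual=UNUSUAL):
    """Return None for don't-care lines, else 'warning' or 'notice'."""
    if any(p.search(line) for p in dont_care):
        return None
    if any(p.search(line) for p in unusual):
        return "warning"
    return "notice"


class LogTail:
    """Track the mtime and read offset of one log file."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.offset = 0
        if os.path.exists(path):
            # Start at the current end so old entries are not re-sent.
            st = os.stat(path)
            self.mtime, self.offset = st.st_mtime, st.st_size

    def poll(self):
        """Return complete lines appended since the last poll."""
        try:
            st = os.stat(self.path)
        except OSError:
            return []
        if st.st_mtime == self.mtime and st.st_size == self.offset:
            return []
        if st.st_size < self.offset:
            # The file was truncated or replaced, e.g. by a restart.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        end = data.rfind(b"\n") + 1  # leave a trailing partial line for later
        self.offset += end
        self.mtime = st.st_mtime
        return data[:end].decode("utf-8", errors="replace").splitlines()


def emit_syslog(level, message):
    """Send one message to the local syslog daemon (Unix only)."""
    import syslog
    priority = syslog.LOG_WARNING if level == "warning" else syslog.LOG_NOTICE
    syslog.syslog(priority, message)


def scan_once(tails, emit=emit_syslog):
    """Poll every log once and report the lines that are not ignored."""
    for tail in tails:
        for line in tail.poll():
            level = classify(line)
            if level is not None:
                emit(level, "%s: %s" % (os.path.basename(tail.path), line))


def watch(paths, interval=60, emit=emit_syslog):
    """Poll forever, once per interval (in seconds)."""
    tails = [LogTail(p) for p in paths]
    while True:
        scan_once(tails, emit)
        time.sleep(interval)
```

Calling watch(LOG_PATHS) would poll once a minute. The size check next to
the mtime check is there because two writes can land within one mtime
tick; whether the messages then end up in a file on another machine is a
matter of syslogd configuration, not of this script.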

Important things may well get logged without anyone ever seeing them,
because rosebud keeps only two days' worth of log entries. (I certainly
don't check the logs every day, but I do read all of them before ever
trying to run a backup...)

Matt
