[655] in SIPB-AFS-requests


Re: rosebud's bosserver dies

daemon@ATHENA.MIT.EDU (Bill Sommerfeld)
Sun Apr 5 12:33:19 1992

Date: Sun, 5 Apr 92 12:32:51 -0400
From: Bill Sommerfeld <wesommer@Athena.MIT.EDU>
To: Calvin Clark <ckclark@Athena.MIT.EDU>,
Cc: sipb-afsreq@Athena.MIT.EDU, sipb-staff@Athena.MIT.EDU
In-Reply-To: Calvin Clark's message of Sun, 5 Apr 92 05:46:48 EDT,

I've heard it claimed somewhere that what's going on is that the
bosserver restarts "too quickly", possibly fires up the (new?)
subprocesses, and then tries to grab the port again, which fails, so
it exits.  

This sounds *almost* plausible, particularly considering how fast &
loose the bosserver is with file descriptors.

There are a few possibilities here worth investigating, if
anyone has the time and the rights to read the source:

 - is there a chance that when the bosserver exec's itself, not all of
FDs 0..2 are open?

 - it looks like the RX file descriptor could potentially wind up as
one of 0..2 under certain conditions, and thus get passed to the
newly-exec'ed bosserver already open and bound to the bosserver
port.  (Has this been happening every *other* week, or when rosebud
has been up for two Sundays in a row?)
 Maybe RX should be patched to "panic" if a socket it allocates is in
the 0..2 range.

 - maybe RX should be changed to mark the sockets it opens
"close-on-exec" (ioctl(fd, FIOCLEX, 0)); that might cause it to
behave better in this case.

 - perhaps the bosserver should be changed to call rx_Init() *before*
calling ReadBozoFile(), so that if it fails to grab the port, it
doesn't leave unmanaged processes hanging around.
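
The first three suggestions above can be sketched roughly as follows.
This is a minimal illustration assuming a POSIX system, not code from
the bosserver or RX sources. The names ensure_std_fds_open() and
open_service_socket() are hypothetical, and the sketch makes two
substitutions: it uses fcntl(fd, F_SETFD, FD_CLOEXEC) where the text
mentions ioctl(fd, FIOCLEX, 0), and it moves a low-numbered socket
above FD 2 before asserting, rather than panicking outright.

```c
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: make sure FDs 0..2 are all open, pointing any
 * closed one at /dev/null.  Returns 0 on success, -1 on error. */
int ensure_std_fds_open(void)
{
    for (int fd = 0; fd <= 2; fd++) {
        if (fcntl(fd, F_GETFD) != -1)
            continue;                      /* already open */
        int nfd = open("/dev/null", O_RDWR);
        if (nfd == -1)
            return -1;
        if (nfd != fd) {                   /* not expected, but be safe */
            if (dup2(nfd, fd) == -1) {
                close(nfd);
                return -1;
            }
            close(nfd);
        }
    }
    return 0;
}

/* Hypothetical helper: create a UDP socket that is outside 0..2 and is
 * marked close-on-exec, so a re-exec'ed server cannot inherit it.
 * Returns the descriptor, or -1 on error. */
int open_service_socket(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s == -1)
        return -1;

    if (s <= 2) {
        /* Duplicate onto the lowest free descriptor >= 3. */
        int moved = fcntl(s, F_DUPFD, 3);
        close(s);
        if (moved == -1)
            return -1;
        s = moved;
    }
    assert(s > 2);                         /* the "panic" check */

    int flags = fcntl(s, F_GETFD);
    if (flags == -1 || fcntl(s, F_SETFD, flags | FD_CLOEXEC) == -1) {
        close(s);
        return -1;
    }
    return s;
}
```

A server would call ensure_std_fds_open() once at startup, before
anything else allocates descriptors, and would bind the socket returned
by open_service_socket() to its port.  Whether either change actually
cures the restart failure described here is untested.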

					- Bill
