[622] in Zephyr_Bugs

serious bug in brain dump code (zephyr 2.0 beta 2)

daemon@ATHENA.MIT.EDU (E. Jay Berkenbilt)
Tue Jul 11 09:56:57 1995

Date: Tue, 11 Jul 1995 09:53:48 -0400
From: "E. Jay Berkenbilt" <ejb@ERA.COM>
To: bug-zephyr@MIT.EDU


The brain dump code in the zephyr server is currently seriously
flawed.  If you have two servers, and one is killed and restarted,
when it reloads its information from the other server, it shows all
clients to be at the other server rather than where they really are.

For example, suppose I set up crash, burn, and soup as their own zephyr
realm with crash and burn running servers, and start zwgc as ejb on
soup, guest on crash, and potato on burn.  A kill -FPE on the zephyr
server (on either machine) then gives this list of clients:

ejb@local-realm/soup.ERA.COM/Tue Jul 11 09:42:17 1995/:0.0/NET_ANN/192.207.166.5/37838
guest@local-realm/crash.ERA.COM/Tue Jul 11 09:43:22 1995/soup:0/NET_ANN/192.207.166.11/1077
potato@local-realm/burn.ERA.COM/Tue Jul 11 09:44:43 1995/soup:0/NET_ANN/192.207.166.19/1071

Now, if I kill crash's server and restart it, a kill -FPE on crash's
new server gives this information:

ejb@local-realm/soup.ERA.COM/Tue Jul 11 09:42:17 1995/:0.0/NET_ANN/192.207.166.19/37838
guest@local-realm/crash.ERA.COM/Tue Jul 11 09:43:22 1995/soup:0/NET_ANN/192.207.166.19/1077
potato@local-realm/burn.ERA.COM/Tue Jul 11 09:44:43 1995/soup:0/NET_ANN/192.207.166.19/1071

The new server now shows all clients as connected to burn.  This means
that after the restart, only people actually logged in on burn receive
zephyrgrams sent to them.  For everyone else, the send succeeds, but the
message is never received since it is directed to the wrong hostmanager.
After this restart, burn's information stays correct, at least for a while.
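
To check a pair of dumps for this symptom mechanically, something like
the following Python sketch may help.  It assumes each client line has
seven slash-separated fields and that the sixth is the address the
server has recorded for that client; those field meanings are guessed
from the sample output above, not taken from the server source, and
parse_client_line and changed_addresses are hypothetical helper names.

```python
# Sketch only: field layout is inferred from the sample dump lines,
# not from the zephyr server source.

BEFORE = [
    "ejb@local-realm/soup.ERA.COM/Tue Jul 11 09:42:17 1995/:0.0/NET_ANN/192.207.166.5/37838",
    "guest@local-realm/crash.ERA.COM/Tue Jul 11 09:43:22 1995/soup:0/NET_ANN/192.207.166.11/1077",
    "potato@local-realm/burn.ERA.COM/Tue Jul 11 09:44:43 1995/soup:0/NET_ANN/192.207.166.19/1071",
]

AFTER = [
    "ejb@local-realm/soup.ERA.COM/Tue Jul 11 09:42:17 1995/:0.0/NET_ANN/192.207.166.19/37838",
    "guest@local-realm/crash.ERA.COM/Tue Jul 11 09:43:22 1995/soup:0/NET_ANN/192.207.166.19/1077",
    "potato@local-realm/burn.ERA.COM/Tue Jul 11 09:44:43 1995/soup:0/NET_ANN/192.207.166.19/1071",
]


def parse_client_line(line):
    """Return (principal, address) for one dump line (assumed 7 fields)."""
    fields = line.strip().split("/")
    if len(fields) != 7:
        raise ValueError("unexpected client line: %r" % line)
    principal, _host, _started, _display, _kind, address, _port = fields
    return principal, address


def changed_addresses(before, after):
    """Map each principal whose recorded address differs to (old, new)."""
    old = dict(parse_client_line(line) for line in before)
    new = dict(parse_client_line(line) for line in after)
    return {
        who: (old[who], new[who])
        for who in old
        if who in new and old[who] != new[who]
    }


changed = changed_addresses(BEFORE, AFTER)
for who, (was, now) in sorted(changed.items()):
    print("%s: %s -> %s" % (who, was, now))
```

Run against the two listings in this report, it flags ejb and guest,
whose addresses both changed to burn's; potato is unchanged because
that client really is on burn.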

Needless to say, this is very serious, since it completely defeats the
point of having multiple servers, especially in a small environment.

(In our real environment, we have two servers, era and icarus, but the
zhms know only about era.  This has always worked in the past.
Icarus, therefore, doesn't do anything but take brain dumps from era.
That way, if the main server crashes or reboots, when it comes up,
users have not lost their subscription information.  I can reproduce
this bug in that scenario, or in the more normal one in which all
clients know about all servers.)

I'd like a patch as soon as you have one.  I may spend a few minutes
looking for this myself, but I'm way behind now in my real work...  (I
figure it's to everyone's advantage to have a good solid zephyr since
we've come to rely on it somewhat, and the old versions haven't ever
really been that robust.  In other words, what I'm doing is good for
my employer as well as for other zephyr users, but my employer is just
not currently aware of this.... :-])

--
E. Jay Berkenbilt (ejb@ERA.COM)  |  Member, League for Programming Freedom
Engineering Research Associates  |  lpf@uunet.uu.net, http://www.lpf.org  
