[626] in Zephyr_Bugs


Re: serious bug in brain dump code (zephyr 2.0 beta 2)

daemon@ATHENA.MIT.EDU (E. Jay Berkenbilt)
Wed Jul 12 13:06:49 1995

Date: Wed, 12 Jul 1995 13:03:22 -0400
From: "E. Jay Berkenbilt" <ejb@ERA.COM>
To: ghudson@MIT.EDU
Cc: ghudson@MIT.EDU, bug-zephyr@MIT.EDU
In-Reply-To: <199507121619.MAA21620@glacier.MIT.EDU> (message from Greg Hudson
	on Wed, 12 Jul 1995 12:19:37 EDT)


Well, the new server with your patch and my trivial patch works fine.
For amusement value and potential usefulness to other testers, I'll
describe what I did:

I observed that an old server would not accept brain dumps from a new
one, but a new server would accept brain dumps from an old one just
fine (minus the missing information).  I spent about 20 minutes
hacking up a special transition version of the zephyr server that,
upon receiving SIGXCPU (which happens to exist on the platform running
our zephyr server), would iterate through the locations[] array and
call gethostbyname() on each host, replacing its address with the
result.  Then it would iterate through the clients table and update
the information there based on a search through the locations[] array
matching on principal and port.  I then performed the following steps:

1. kill the slave server
2. start my special hacked transition server on the slave
3. send SIGXCPU to the hacked transition server, thereby causing its
   locations and clients databases to become accurate
4. kill the master server
5. restart the master with a clean (no transition code) patched server
6. kill and restart the slave

After this point, both the slave and master had all up-to-date
information, and no users lost subscription or location information,
so the whole thing went unnoticed by the user community. :-)
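
The two repair passes described above can be sketched in isolation.
This is only an illustration under stated assumptions, not the server's
code: the toy_location and toy_client structs, the toy_resolve() lookup
table (standing in for gethostbyname() so the sketch runs without DNS),
and the example host names are all hypothetical.  Unlike the throwaway
patch, which falls back to locations[0] when no match is found, this
version leaves an unmatched client untouched.

```c
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Toy stand-ins for the server's tables; field names are illustrative. */
struct toy_location {
    const char *user;        /* principal of the located user */
    const char *machine;     /* host name recorded for the location */
    struct sockaddr_in addr; /* address and port (network byte order) */
};

struct toy_client {
    const char *principal;
    struct sockaddr_in addr;
};

/* Hypothetical resolver: a fixed table instead of gethostbyname().
 * Returns 0 on success, -1 if the host is unknown. */
static int toy_resolve(const char *host, struct in_addr *out)
{
    static const struct { const char *name; const char *ip; } table[] = {
        { "alpha.example.com", "192.0.2.10" },
        { "beta.example.com",  "192.0.2.20" },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(host, table[i].name) == 0)
            return inet_pton(AF_INET, table[i].ip, out) == 1 ? 0 : -1;
    }
    return -1;
}

/* Pass 1: refresh each location's address from its host name.
 * Returns the number of entries updated; unknown hosts are skipped. */
static int toy_repair_locations(struct toy_location *locs, int n)
{
    int i, fixed = 0;

    for (i = 0; i < n; i++) {
        struct in_addr a;
        if (toy_resolve(locs[i].machine, &a) == 0) {
            locs[i].addr.sin_addr = a;
            fixed++;
        }
    }
    return fixed;
}

/* Pass 2: copy the repaired address to each client whose principal and
 * port match a location.  Unmatched clients are left as they were.
 * Returns the number of clients updated. */
static int toy_repair_clients(struct toy_client *clients, int nc,
                              const struct toy_location *locs, int nl)
{
    int i, j, fixed = 0;

    for (i = 0; i < nc; i++) {
        for (j = 0; j < nl; j++) {
            if (strcmp(clients[i].principal, locs[j].user) == 0 &&
                clients[i].addr.sin_port == locs[j].addr.sin_port) {
                clients[i].addr.sin_addr = locs[j].addr.sin_addr;
                fixed++;
                break;
            }
        }
    }
    return fixed;
}
```

The order matters: toy_repair_locations() has to run first, because
toy_repair_clients() only copies whatever addresses the locations table
holds at that moment.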

In case anyone cares, here's the transition patch relative to the
otherwise patched server.  This code is not necessarily
well-written/robust, portable, or efficient.  I wrote it in 20 minutes
for a single use and then threw it away.  I am not even saving this
patch except in my log of zephyr-related mail.  This patch is provided
for amusement value and/or for use by a tester in the same bind (not
wanting to have to do another global restart for all users).  Needless
to say, I hope, it should definitely not be considered for actual
inclusion in the source tree.... :-)

Anyway, now we're happily running zephyr 2.0 (beta 2 +) with all the
functionality of our old installation as well as considerable
additional functionality.  We'll be eagerly awaiting the final 2.0.


--
E. Jay Berkenbilt (ejb@ERA.COM)  |  Member, League for Programming Freedom
Engineering Research Associates  |  lpf@uunet.uu.net, http://www.lpf.org  

===========================================================================

--- server/client.c.qqdist	Fri Jul  7 18:11:53 1995
+++ server/client.c	Wed Jul 12 12:32:56 1995
@@ -219,3 +219,26 @@
     return NULL;
 }
 
+extern struct in_addr get_addr_for_client(char *, unsigned int);
+void repair_client_info()
+{
+    Client *client;
+    int i;
+    struct in_addr addr;
+
+    for (i = 0; i < HASHSIZE; i++) {
+	for (client = client_bucket[i]; client; client = client->next)
+	{
+	    addr = get_addr_for_client(client->principal->string,
+				       client->addr.sin_port);
+#if 0
+	    printf("Changing information for client %s/%d from %s to %s\n",
+		   client->principal->string, client->addr.sin_port,
+		   inet_ntoa(client->addr.sin_addr),
+		   inet_ntoa(addr));
+#endif
+	    client->addr.sin_addr = addr;
+	}
+    }
+}
+
--- server/main.c.qqdist	Fri Jul  7 18:12:12 1995
+++ server/main.c	Wed Jul 12 12:24:11 1995
@@ -63,6 +63,7 @@
 static RETSIGTYPE sig_dump_strings __P((int));
 static RETSIGTYPE reset __P((int));
 static RETSIGTYPE reap __P((int));
+static RETSIGTYPE repair_info __P((int));
 static void read_from_dump __P((char *dumpfile));
 static void dump_db __P((void));
 static void dump_strings __P((void));
@@ -260,6 +261,9 @@
     action.sa_handler = sig_dump_db;
     sigaction(SIGFPE, &action, NULL);
 
+    action.sa_handler = repair_info;
+    sigaction(SIGXCPU, &action, NULL);
+
 #ifdef SIGEMT
     action.sa_handler = sig_dump_strings;
     sigaction(SIGEMT, &action, NULL);
@@ -274,6 +278,7 @@
     signal(SIGUSR2, dbug_off);
     signal(SIGCHLD, reap);
     signal(SIGFPE, sig_dump_db);
+    signal(SIGXCPU, repair_info);
 #ifdef SIGEMT
     signal(SIGEMT, sig_dump_strings);
 #endif
@@ -719,3 +724,11 @@
     return;
 }
 
+extern void repair_location_info();
+extern void repair_client_info();
+
+static RETSIGTYPE repair_info(int s)
+{
+    repair_location_info();
+    repair_client_info();
+}
--- server/uloc.c.qqdist	Fri Jul  7 22:49:01 1995
+++ server/uloc.c	Wed Jul 12 12:34:07 1995
@@ -1190,3 +1190,49 @@
     free(loc->time);
     return;
 }
+
+void repair_location_info()
+{
+    struct in_addr old_addr, new_addr;
+    char *host;
+    struct hostent *hp;
+    int i;
+
+    for (i = 0; i < num_locs; i++)
+    {
+	host = locations[i].machine->string;
+	printf("%d: %s %s %d\n", i, locations[i].user->string, host,
+	       locations[i].addr.sin_port);
+	old_addr = locations[i].addr.sin_addr;
+	if ((hp = gethostbyname(host)) == NULL)
+	{
+	    printf("Unknown host %s\n", host);
+	}
+	else
+	{
+	    memcpy((char *)&(new_addr.s_addr),
+		   (char *)hp->h_addr_list[0], hp->h_length);
+	    locations[i].addr.sin_addr.s_addr = new_addr.s_addr;
+#if 0
+	    printf("  changed ip address from %s to %s\n",
+		   inet_ntoa(old_addr), inet_ntoa(new_addr));
+#endif
+	}
+    }
+}
+
+struct in_addr get_addr_for_client(char *user, unsigned int port)
+{
+    int i;
+
+    for (i = 0; i < num_locs; i++)
+    {
+	if ((strcmp(user, locations[i].user->string) == 0) &&
+	    (port == locations[i].addr.sin_port))
+	{
+	    return locations[i].addr.sin_addr;
+	}
+    }
+    printf("No match found for %s/%d\n", user, port);
+    return locations[0].addr.sin_addr;
+}
