[233] in Athena_Backup_System
change host of master
daemon@ATHENA.MIT.EDU (Diane Delgado)
Fri Apr 19 09:34:00 1996
To: athena-backup@MIT.EDU
Date: Fri, 19 Apr 1996 09:33:51 EDT
From: Diane Delgado <delgado@MIT.EDU>
Here is the first pass at the change host of master functionality:
Slave
a. When a slave is started, the master's host is specified via a
command-line option.
b. We will now add a config file for the slave where admins can specify the
master's host. The config file will have a line with the following
syntax:
master: green-acres
where the actual host name is substituted for "green-acres".
c. A slave can be forced to re-read its config file by sending it a
kill -HUP
This has the side effect of setting the slave's internal variable
which keeps track of the master's hostname. The slave uses this
variable when it attempts to set up a connection with the master.
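
The slave behavior in (b) and (c) could be sketched roughly as follows.
This is only an illustration in Python, not the actual slave code: the
names parse_master_host, SlaveState, install_hup_handler and the path
/etc/abs/slave.conf are all hypothetical, and keeping the old host when
the file has no "master:" line is an assumption.

```python
import signal

# Hypothetical location; the original message does not name the file.
DEFAULT_CONFIG_PATH = "/etc/abs/slave.conf"


def parse_master_host(config_text):
    """Return the host named on the 'master:' line, or None if absent."""
    for raw_line in config_text.splitlines():
        key, sep, value = raw_line.strip().partition(":")
        if sep and key.strip() == "master" and value.strip():
            return value.strip()
    return None


class SlaveState:
    """Holds the master's hostname used when opening a connection."""

    def __init__(self, master_host=None):
        self.master_host = master_host

    def reload(self, config_path):
        """Re-read the config file and update the stored hostname.

        Assumption: if the file has no 'master:' line, keep the old host.
        """
        with open(config_path) as handle:
            host = parse_master_host(handle.read())
        if host is not None:
            self.master_host = host
        return self.master_host


def install_hup_handler(state, config_path=DEFAULT_CONFIG_PATH):
    """Arrange for SIGHUP (kill -HUP <pid>) to trigger a config re-read."""

    def on_hup(signum, frame):
        try:
            state.reload(config_path)
        except OSError:
            pass  # unreadable config: keep the current master host

    signal.signal(signal.SIGHUP, on_hup)
```

With something like this in place, the connection code would read
state.master_host each time it connects, so a HUP takes effect on the
next connection attempt.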
---------------
Master
a. Moving the Master to another host (non-disaster).
In addition to moving binaries and the dbms files,
the administrator must move the "jobs" directory and its
contents (located in /var/abs). The "jobs" directory contains
information about what is currently executing. Moving
this directory and its contents will ensure that the Master
continues to have an accurate picture of what tasks are
executing on which slaves. This directory and contents
must exist before the Master is started (the Master reads
this directory and its contents when it first starts up).
Moving the jobs information implies the slaves do not need to
re-execute the registration procedure. Admins can verify the master's
knowledge of the current state of things by executing the "show jobs"
command after the master has started.
(Please let me know if having to copy this directory and
contents is something that would NOT work well or would
be inconvenient - we can think of something else).
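
As a rough illustration of the copy step above, an admin-side helper
might look like the sketch below. It is not part of the backup system:
the function names stage_jobs_directory and jobs_ready are hypothetical,
and refusing to overwrite an existing jobs directory is an assumption.
Only the /var/abs location and the "jobs" directory name come from the
description above.

```python
import shutil
from pathlib import Path


def stage_jobs_directory(exported_jobs, abs_root="/var/abs"):
    """Copy an exported "jobs" tree into <abs_root>/jobs on the new host."""
    source = Path(exported_jobs)
    target = Path(abs_root) / "jobs"
    if not source.is_dir():
        raise FileNotFoundError(f"no jobs directory at {source}")
    if target.exists():
        # Assumption: never clobber job state already on the new host.
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(source, target)
    return target


def jobs_ready(abs_root="/var/abs"):
    """True if the jobs directory is in place, so the Master may start."""
    return (Path(abs_root) / "jobs").is_dir()
```

A wrapper script could call jobs_ready() before launching the Master,
and the admin would still run "show jobs" afterwards to confirm the
state.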
b. Disaster mode
It is conceivable that the information in the jobs directory might
not be available during disaster mode.
There are a number of issues relating to disaster mode that aren't yet
resolved which will impact the handling of a new master location; e.g.,
does the fake master keep any information about the slave activities?
Does the fake master keep track of what slaves are configured
into the system?
If the fake master does not maintain state information, then
slaves do not need to re-execute the registration process.
This implies the fake master will never return a "no such job"
error to a slave, and that the fake master accepts calls from
any authenticated slave in its realm.