[66] in Athena_Backup_System
Comments on error handling
daemon@ATHENA.MIT.EDU (Richard Basch)
Sun Jan 15 09:54:34 1995
Date: Sun, 15 Jan 1995 09:54:07 -0500
To: jweiss@MIT.EDU, athena-backup@MIT.EDU
From: "Richard Basch" <basch@MIT.EDU>
For slave->master auth errors, I would suggest that the initial
connections for master->slave communication be mutually
authenticated. This reduces the chance of a later "auth error"
caused by a key change or something similar.
Also, for "volume has moved", the data structure returned should
say where the volume has moved to. The master can then decide whether
it can reschedule, based on various site criteria. For instance,
if I am asking for an archive copy of "reference" volumes, and one
reference volume has moved, it can be rescheduled. If it has moved between
servers, then perhaps the server dumpsets can be updated. If the rules
do not give a definite answer, then mark it as a volume that needs a
sysadmin to investigate.
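To make the "volume has moved" suggestion concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the names VolumeMoved, Action, and decide, the field layout, and the particular site rules are all hypothetical and not part of any existing design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    RESCHEDULE = "reschedule"
    UPDATE_DUMPSET = "update_dumpset"
    NEEDS_ADMIN = "needs_admin"


@dataclass
class VolumeMoved:
    """Hypothetical error record returned by the slave to the master."""
    volume: str
    old_server: str
    new_server: Optional[str]  # None if the slave could not find out


def decide(err: VolumeMoved, reference_volumes: Set[str],
           known_servers: Set[str]) -> Action:
    """Apply one example site policy (an assumption, not a fixed rule)."""
    # No usable destination: the rules cannot decide, so ask a human.
    if err.new_server is None or err.new_server not in known_servers:
        return Action.NEEDS_ADMIN
    # A moved "reference" volume can simply be dumped from its new home.
    if err.volume in reference_volumes:
        return Action.RESCHEDULE
    # A move between servers suggests the server dumpsets are stale.
    if err.new_server != err.old_server:
        return Action.UPDATE_DUMPSET
    return Action.NEEDS_ADMIN
```

The point is only that the error carries the destination, so the policy can live entirely in the master and differ from site to site.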
For "drive time-out", I would say that you should schedule a retry.
Perhaps the drive was simply initializing (some will retension tapes and
not accept requests during that time, etc.). If the retry fails, abort the
dump. In either case, continue to report it to the master.
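The retry behavior could look roughly like the following Python sketch. The names DriveTimeout and dump_with_retry, the callback arguments, and the 60-second default delay are assumptions made for illustration; the real interfaces would differ.

```python
import time
from typing import Callable


class DriveTimeout(Exception):
    """Hypothetical error raised when the tape drive does not respond."""


def dump_with_retry(start_dump: Callable[[], None],
                    report: Callable[[str], None],
                    retries: int = 1,
                    delay_seconds: float = 60.0) -> bool:
    """Run a dump, retrying after a drive time-out; report every failure.

    Returns True if the dump ran, False if it was aborted.
    """
    attempt = 0
    while True:
        try:
            start_dump()
            return True
        except DriveTimeout as exc:
            # Report to the master whether or not a retry follows.
            report(f"drive time-out on attempt {attempt + 1}: {exc}")
            if attempt >= retries:
                report("dump aborted: retry failed")
                return False
            attempt += 1
            # Give the drive time to finish initializing or retensioning.
            time.sleep(delay_seconds)
```

With the default of one retry, a drive that was merely retensioning gets a second chance, and the master still sees a report for the first time-out.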
-Richard