[66] in Athena_Backup_System


Comments on error handling

daemon@ATHENA.MIT.EDU (Richard Basch)
Sun Jan 15 09:54:34 1995

Date: Sun, 15 Jan 1995 09:54:07 -0500
To: jweiss@MIT.EDU, athena-backup@MIT.EDU
From: "Richard Basch" <basch@MIT.EDU>


For slave->master auth errors, I would suggest that the initial
connections for the master->slave communications be mutually
authenticated.  This will reduce the chance of a later "auth error"
caused by a key change or the like.

Also, for "volume has moved", the data structure returned should say
where the volume has moved to.  The master can then decide whether this is
something it can re-schedule, based on various site criteria.  For
instance, if I am asking for an archive copy of "reference" volumes, and
one reference volume has moved, it can be rescheduled.  If it has moved
between servers, then perhaps the server dumpsets can be updated.  If the
rules do not give a definite answer, then mark it as a volume that needs a
system administrator to investigate.
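
As an illustration only, the master-side decision could be sketched as
below.  This is a minimal sketch under assumptions: the names VolumeMoved,
Action and decide, and the exact ordering of the rules, are hypothetical
and stand in for whatever site criteria actually apply.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    RESCHEDULE = "reschedule"
    UPDATE_DUMPSET = "update_dumpset"
    NEEDS_ADMIN = "needs_admin"


@dataclass
class VolumeMoved:
    """Hypothetical error record returned by the slave to the master."""
    volume: str
    old_server: str
    new_server: Optional[str]  # where the volume moved to, if known


def decide(err: VolumeMoved, is_reference_archive: bool) -> Action:
    """Apply an assumed site policy to a 'volume has moved' error."""
    if err.new_server is None:
        # Without the new location no rule can be applied.
        return Action.NEEDS_ADMIN
    if is_reference_archive:
        # Archive copy of a reference volume: dump it from its new home.
        return Action.RESCHEDULE
    if err.new_server != err.old_server:
        # Moved between servers: the per-server dumpsets may need updating.
        return Action.UPDATE_DUMPSET
    # No rule gave a definite answer: leave it for a human.
    return Action.NEEDS_ADMIN
```

The point of the sketch is only that the decision depends on the new
location being present in the returned record.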

For "drive time-out", I would say that you should schedule a retry.
Perhaps the drive was simply initializing (some drives retension tapes and
do not accept requests during that time).  If the retry fails, abort the
dump.  In either case, report the time-out to the master.
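
The retry rule above can be sketched as follows.  This is a minimal
sketch, not an actual interface of the backup system: DriveTimeout,
DumpAborted, dump_with_retry, the start_dump and report callables, and the
60-second delay are all assumed names and values.

```python
import time


class DriveTimeout(Exception):
    """Hypothetical error raised when the tape drive does not respond."""


class DumpAborted(Exception):
    """Hypothetical error raised when the dump is given up."""


def dump_with_retry(start_dump, report, retry_delay=60.0, sleep=time.sleep):
    """Run start_dump(); on a drive time-out, retry once, then abort.

    report(message) stands in for the message sent to the master.
    It is called for every time-out, whether or not the retry succeeds.
    """
    try:
        return start_dump()
    except DriveTimeout as exc:
        report(f"drive time-out, retry scheduled: {exc}")

    # Give the drive time to finish initializing or retensioning.
    sleep(retry_delay)

    try:
        return start_dump()
    except DriveTimeout as exc:
        report(f"drive time-out on retry, dump aborted: {exc}")
        raise DumpAborted(str(exc)) from exc
```

The master sees one report if the retry succeeds and two if the dump is
aborted, so it is informed in either case.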

-Richard
