[135] in Athena_Backup_System
Next Meeting: Thu Nov 16 @3:00 PM E40-316
daemon@ATHENA.MIT.EDU (Bill Cattey)
Mon Nov 13 18:53:37 1995
Date: Mon, 13 Nov 1995 18:53:07 -0500 (EST)
From: Bill Cattey <wdc@MIT.EDU>
To: athena-backup@MIT.EDU, mbarker@MIT.EDU
The next meeting of the Athena Backup System team will be at 3:00 on
Thursday November 16 in the small conference room on the third floor of
E40 -- E40-316.
Agenda:
Review Status
Jweiss: overview doc
Delgado: slave status, and merge of new content into Master doc.
vrt: test plan stuff.
wdc: will try and find the fabled detailed acceptance criteria doc of
yore (in Mac format?)
This meeting is expected to be SHORT.
----
Gems from previous meeting, 9 November 1995:
We established a direction to take the documentation so as to be a more
approachable description of the system:
1. The overview document will be revised to reflect current
implementation practice.
2. The slave document and master document will refer to the overview document.
3. The question of which part is responsible for telling the operator
"MOUNT THIS TAPE" will be explicitly answered.
4. The operator's guide will begin to be written next week by Dave
Krikorian with significant collaboration by Brian Melansen. (Then we'll
see if the customer's idea of what is done syncs up with what has been
developed!)
5. The document given to the person who loads tapes will be one page or
less in length.
We spent time clarifying our understanding of the various requirements
for operating the system in the event of a database failure. In order
of decreasing priority, a need was recognized to be able to:
1. restore a crashed disk partition from tape.
2. restore a specific AFS volume.
3. perform a manual backup.
4. get the database back up and resume full-function backups.
To perform a manual backup, the pre-specified test program for the tape
slave, which pretends to be the master, could be written in such a way
as to:
1. make ASCII logs to the local disk that would get merged into the
database when it came back up (by using a simple SQL utility.)
2. stub out the tape slave's request for tape label verification so that
only UNLABELED tapes would be used (the implication here is that you
would be required to use blank tapes instead of database-chosen
recyclables, and that a separate utility would be used to erase the tape
label of a tape CAREFULLY chosen by hand to be recycled during a
database outage.)
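
As a rough illustration of the two points above, here is a minimal sketch
in Python. Everything specific in it is an assumption made for the example
and not part of the design as discussed: the pipe-delimited record layout,
the function names, the table name, and the use of sqlite3 as a stand-in
for the real database and its "simple SQL utility."

```python
import sqlite3

# Hypothetical record layout: timestamp|volume|tape_id|status


def format_record(timestamp, volume, tape_id, status):
    """Render one backup event as a single pipe-delimited ASCII line."""
    return "|".join((timestamp, volume, tape_id, status))


def tape_is_usable(label):
    """Stubbed label check: with no database to consult, accept only unlabeled tapes."""
    return label is None or label == ""


def merge_log(lines, conn):
    """Load logged records into the database once it is back up; return the count merged."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backups "
        "(timestamp TEXT, volume TEXT, tape_id TEXT, status TEXT)"
    )
    # Skip blank lines; every other line is expected to carry exactly four fields.
    rows = [tuple(line.rstrip("\n").split("|")) for line in lines if line.strip()]
    conn.executemany("INSERT INTO backups VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

During an outage the stand-in master would append format_record() lines
to a local file; after recovery, merge_log(open(path), conn) would replay
them into the database.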
Operational procedures, as documented in the operator's guide and as
implemented in the master, would:
1. save a printout of the backup schedule on a regular basis so that
people would know which backups to perform if the database went down.
2. save an ASCII format of the database at the same time as the binary
format during the normal database backup procedures.
3. format the gazette log produced on every tape operation so that it is
usable as an ASCII transaction log. In the event of a database failure,
that log and the ASCII db backup could be reviewed to locate volumes for
restore, and to know which backups had been done successfully during the
database down time.
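
To make the "review the log to locate volumes" idea concrete, here is a
small Python sketch. The gazette log format has not been specified, so the
pipe-delimited layout (timestamp|volume|tape_id|status), the "ok" status
value, and the function names are all hypothetical.

```python
def parse_line(line):
    """Split one pipe-delimited log line into a dict (hypothetical field layout)."""
    timestamp, volume, tape_id, status = line.strip().split("|")
    return {"timestamp": timestamp, "volume": volume,
            "tape_id": tape_id, "status": status}


def latest_good_backup(lines, volume):
    """Return the most recent successful record for a volume, or None if there is none."""
    best = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        rec = parse_line(line)
        if rec["volume"] != volume or rec["status"] != "ok":
            continue
        # Assumes ISO-style timestamps, which sort correctly as plain strings.
        if best is None or rec["timestamp"] > best["timestamp"]:
            best = rec
    return best
```

An operator looking for the tape to mount for a restore would call
latest_good_backup(open(logfile), "some.volume") and read off tape_id.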
We hope and expect never to need these degraded modes of manual
operation, but now they're a part of our design.
We decided to focus on the keyboard user interface to the system and
leave the Graphical Interface to the next version (although Dianne will
spend a little time to sync the present skeletal GUI with the system as
it is currently implemented to make it easier to resume this work later.)
We discussed a ranking of the various deployment platforms for the
Athena Backup System software:
Solaris and SunOS have highest priority. This is because SunOS holds
many file servers, but Operations has the easiest time making servers
out of Solaris because mkserv doesn't really exist for SunOS.
The AIX platform might be important if there is a requirement to
continue to use the same servers we currently use for backups. (But
Ops wants to move away from RS/6000 servers and towards Sun servers.)
Perhaps IRIX will become significant if SGIs are ever chosen to be a
server platform.
------
That's how I remember the action as my notes recorded it.
I apologize in advance for any incorrect reproductions.
-wdc