[1734] in Release_7.7_team


IRIX OS checking

daemon@ATHENA.MIT.EDU (Robert A Basch)
Mon Mar 29 20:49:16 1999

To: release-team@MIT.EDU
Date: Mon, 29 Mar 1999 20:49:09 EST
From: Robert A Basch <rbasch@MIT.EDU>

An outline of a scheme to check the integrity of the OS for IRIX...

The setup:

1) Use "showprods" to extract a list of all OS products from inst's
data base.  Parse the output of "showprods -v" (which outputs product
specifications) to determine which products are machine-dependent.
Separate the products into a "common" list, and a per-architecture
machine-dependent list.

2) Use "showfiles" to extract a list of all OS files from inst's data
base.  Machine-dependent and config files are flagged in the output;
we ignore the latter, and separate the former into a list for
the particular architecture.  Add to the machine-dependent list any
files belonging to products that are machine-dependent.  Everything
else goes to a list of "common" files.  Repeat this process to get the
machine-dependent file list for each architecture we support.

3) Parse the ".idb" files in the distribution directories used in the
install, in order to find "duplicate" machine-dependent files,
i.e. files whose path is non-unique in the distribution, and thus
cannot be assumed to be correct for a particular architecture in the
os volume.  Separate the machine-dependent list into these duplicate
files, and all others.

4) Generate "stats" files, using a new program which stat's a given
list of paths, and outputs relevant info: file type (regular, symlink,
directory); for symlinks, the symlink content; for directories and files,
stat info (mode, uid, gid, and, for files, size).  Note that neither
file modification time nor checksum would be recorded.  The mod time
is useless, since files installed by inst get the current time as
the mod time; the checksum is next-to-useless, because rqs'ed files
can have differing checksums, and computing the checksums would take
far longer than we would want.  For each architecture,
generate stats for "unique" machine-dependent files, and for
"duplicate" machine-dependent files, using a newly installed root
target.  Also generate one "common" stats file.

5) Install the stats files and product lists in the IRIX install volume. 
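
As a rough illustration of the stats program in step 4, here is a
minimal Python sketch.  The record layout (a type letter, the path,
then either the symlink content or mode/uid/gid and, for files, size)
and the names stat_record and root are assumptions made for this
example; they are not the actual stats file format.  Paths containing
whitespace are not handled.

```python
import os
import stat


def stat_record(path, root="/"):
    """Return one stats line for path, looked up beneath root.

    Hypothetical layout: "l PATH TARGET", "d PATH MODE UID GID",
    or "f PATH MODE UID GID SIZE".  No mod time or checksum is kept.
    """
    full = os.path.join(root, path.lstrip("/"))
    st = os.lstat(full)  # lstat, so symlinks are not followed
    if stat.S_ISLNK(st.st_mode):
        return "l %s %s" % (path, os.readlink(full))
    perms = "%o %d %d" % (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
    if stat.S_ISDIR(st.st_mode):
        return "d %s %s" % (path, perms)
    if stat.S_ISREG(st.st_mode):
        return "f %s %s %d" % (path, perms, st.st_size)
    raise ValueError("unsupported file type: %s" % path)
```

Running this over each path list against a newly installed root would
yield one stats file per list.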


At check time:

1) Generate a list of "exception" files, which we do not want to
check: Begin with a global exception list, if any, read from the
install volume (these are files whose status is apparently incorrect
in the inst data base, e.g. a file which should have been marked as
config, or a file whose history should not have been recorded); add
anything tracked from the srvd, using the srvd stats file; for a
private machine, add any files belonging to any products installed (or
removed) locally, plus any file in a local exception list.

2) Run a new program which checks the entries in a given stats file
produced above, skipping any specified exceptions, and fixing any
damage found by recreating symlinks and directories, and copying files
from the os hierarchy.  The program would also use a reference
timestamp file, to ensure that no file has been modified more
recently than the reference file.  Check against the stats file of
"unique" machine-dependent files for the particular architecture, if
present, and against the "common" stats file.  In order to check the
"duplicate" machine-dependent files, we would need to maintain a
hierarchy of these files per-architecture in the install volume.  My
estimate is that each such tree would consume less than 20 MB, so that
maintaining a tree for all of our currently supported architectures
would require about 100 MB.

3) Maintain a "touch" file, to be used as the reference file above,
whose modification time must be later than that of any file checked.
This file would be touched at the end of the checking procedure, and
after an update which installed or removed OS products.
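
The detection half of the check-time procedure might look roughly like
the Python sketch below.  It is only an illustration: the entry tuples,
the names check_entry and check_all, and the problem strings are all
assumptions, repair (recreating symlinks and directories, copying files
from the os hierarchy) is left out, and the modification time test is
applied to regular files only, which is a guess at the intent.

```python
import os
import stat


def check_entry(root, path, kind, expected, ref_mtime):
    """Return a list of problems for one entry; empty if it is fine.

    kind is "f", "d" or "l".  expected is the symlink content for "l",
    (mode, uid, gid) for "d", and (mode, uid, gid, size) for "f".
    """
    full = os.path.join(root, path.lstrip("/"))
    try:
        st = os.lstat(full)
    except OSError:
        return ["missing"]
    if kind == "l":
        if not stat.S_ISLNK(st.st_mode):
            return ["not a symlink"]
        return [] if os.readlink(full) == expected else ["wrong target"]
    is_kind = stat.S_ISDIR if kind == "d" else stat.S_ISREG
    if not is_kind(st.st_mode):
        return ["wrong type"]
    problems = []
    actual = (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
    if actual != tuple(expected[:3]):
        problems.append("mode/owner mismatch")
    if kind == "f":
        if st.st_size != expected[3]:
            problems.append("size mismatch")
        # The file must not be newer than the reference "touch" file.
        if st.st_mtime > ref_mtime:
            problems.append("modified after reference")
    return problems


def check_all(root, entries, exceptions, ref_mtime):
    """Check every entry not in exceptions; map path -> problems."""
    skip = set(exceptions)
    report = {}
    for path, kind, expected in entries:
        if path in skip:
            continue  # exception files are never checked
        problems = check_entry(root, path, kind, expected, ref_mtime)
        if problems:
            report[path] = problems
    return report
```

Here ref_mtime would be the modification time of the "touch" file, and
exceptions the combined list built in step 1.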


Notes:

1) There are clearly weaknesses with this approach, including:
only files installed and recorded by inst will be checked, which
omits certain critical files, such as /unix; it depends on a local
reference file for modification time checking; generating the
stats and product list files, etc., is tedious, and must be redone
whenever we change the OS.

2) The stats file should also note whether a file should get rqs
processing after being restored from the os volume.  I'm thinking of
leaving this unimplemented for now, at least until I resolve some
other rqs issues for 6.5 (and make sure I fully understand rqs
processing).

3) In testing, it takes a little over 3 minutes to do a complete check
of "unique" machine-dependent files and "common" files.  (There is only a
small number of "duplicate" machine-dependent files, so adding that
case should not add significantly to the time.)
