[356] in Info-AFS_Redistribution
[Liz_Hines@transarc.com: problem with AFS 3.1a v3fshelper (AIX 3.x systems)]
daemon@ATHENA.MIT.EDU (Richard Basch)
Tue Oct 22 23:31:03 1991
Date: Tue, 22 Oct 91 23:32:02 -0400
To: info-afs@MIT.EDU
From: "Richard Basch" <basch@MIT.EDU>
Luckily, I never bothered upgrading this program on any of the machines
because I felt uneasy about replacing a known-to-be-working filesystem
helper with an untested, new-release product.
-R
------- Forwarded Message
Date: Tue, 22 Oct 1991 20:04:38 -0400 (EDT)
From: Liz_Hines@transarc.com
To: AFS_Contacts@transarc.com
Subject: problem with AFS 3.1a v3fshelper (AIX 3.x systems)
AFS Site Contacts:
We have discovered a problem with the AFS 3.1a v3fshelper program
(run by fsck) for AIX 3.1 systems - AFS volumes can be corrupted as
a result of running the AFS 3.1a v3fshelper program. The problem
occurs when fsck checks the filesystem and reports problems with the
inode map (detailed symptoms are at the end of this message).
This does not happen if the filesystem was unmounted cleanly or if
(we believe) the Journalled File System log has been replayed
successfully.
This only affects AFS servers running on AIX 3.1 (all versions
of AIX 3.1) that have been upgraded to AFS 3.1a.
Our recommended work-around is to install the AFS 3.1 v3fshelper
program (as opposed to the AFS 3.1a v3fshelper program) when upgrading
to AFS 3.1a.
If you have already upgraded to AFS 3.1a and v3fshelper may have run and
modified the inode map (either manually or via the /etc/rc script),
you should immediately take the following steps to save your data:
1) back up the volumes on these partitions
2) move the volumes to other servers
3) install AFS 3.1 v3fshelper
4) use SMIT to recreate the Journalled File System on these partitions
5) move the volumes back to these partitions
Problems do not occur with the existing data until additional
files or volumes are created on the partition in question. Therefore,
you should be able to save the existing data and move it.
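
Steps 2 and 5 above amount to moving each volume off the affected partition and later moving it back. The sketch below only builds the command lines for review; it does not run anything. It assumes the `vos move` flag names (`-id`, `-fromserver`, `-frompartition`, `-toserver`, `-topartition`) as documented for later AFS releases, so check them against your own release first. The server name `spare1` and partition `b` are hypothetical; `junior`, `a`, and the volume names come from the listing in the detailed symptoms.

```python
def plan_moves(volumes, from_server, from_part, to_server, to_part):
    """Build one 'vos move' argument list per volume (nothing is executed)."""
    return [
        ["vos", "move",
         "-id", str(volume),
         "-fromserver", from_server, "-frompartition", from_part,
         "-toserver", to_server, "-topartition", to_part]
        for volume in volumes
    ]


def plan_evacuate_and_return(volumes, server, part, spare_server, spare_part):
    """Return (away, back): commands for step 2 and for step 5."""
    away = plan_moves(volumes, server, part, spare_server, spare_part)
    back = plan_moves(volumes, spare_server, spare_part, server, part)
    return away, back


# Hypothetical target: server "spare1", partition "b".
away, back = plan_evacuate_and_return(
    ["palo.rs.6", "palo.rs.7"], "junior", "a", "spare1", "b"
)
for command in away:
    print(" ".join(command))
```

The `back` list is meant to be used only after steps 3 and 4 (installing the AFS 3.1 v3fshelper and recreating the file system) are complete.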
We are working on a fix for this problem, and we will keep
you informed of the progress.
Liz Hines
Product Support Manager, File Systems
Transarc Corporation
*********************************************************************
Detailed symptoms:
If you run fsck by hand, you will see something similar to
the following:
% fsck /vicepa
** Checking /dev/rlv01 (/vicep)
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Inode Map
Bad Inode Map; SALVAGE? y
** Phase 5b - Salvage Inode Map
** Phase 6 - Check Block Map
50 AFS files, 12 non-AFS files,
62 files 5120 blocks 3072 free
***** File system was modified *****
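
If fsck output has been captured to a file, the symptom above can be searched for mechanically. This is a minimal sketch, assuming the output contains the same strings as the transcript above ("Bad Inode Map" and "Salvage Inode Map"); other fsck versions may word these lines differently. The function name is illustrative only.

```python
def inode_map_was_salvaged(fsck_output):
    """Return True if the transcript shows the inode map being reported bad
    or salvaged, based on the strings seen in the sample output."""
    lines = [line.strip() for line in fsck_output.splitlines()]
    bad_map = any(line.startswith("Bad Inode Map") for line in lines)
    salvage_phase = any("Salvage Inode Map" in line for line in lines)
    return bad_map or salvage_phase


# Excerpt of the transcript shown above.
sample = """\
** Phase 5 - Check Inode Map
Bad Inode Map; SALVAGE? y
** Phase 5b - Salvage Inode Map
** Phase 6 - Check Block Map
***** File system was modified *****
"""

clean = """\
** Phase 5 - Check Inode Map
** Phase 6 - Check Block Map
"""

print(inode_map_was_salvaged(sample))
print(inode_map_was_salvaged(clean))
```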
It is very possible that your AIX 3.1 system has the following
in the /etc/rc script:
# Perform file system checks
# The -f flag skips the check if the log has been replayed successfully
fsck -fp
This tells fsck to run and, in effect, answer "yes" to any
questions, which would cause the inode map to be salvaged without
notifying anyone.
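
To see whether a given rc script invokes fsck this way, its text can be scanned for fsck commands that carry the `p` flag. The sketch below is an assumption-laden helper, not a Transarc tool: it treats everything after `#` as a comment, only recognizes fsck at the start of a command line, and the function name is made up for illustration.

```python
import re

# An fsck invocation at the start of a line; group 1 holds its arguments.
FSCK_COMMAND = re.compile(r"^\s*fsck\b(.*)$")


def unattended_fsck_lines(rc_text):
    """Return (line_number, text) for each fsck command that uses the p flag."""
    hits = []
    for number, line in enumerate(rc_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore shell comments
        match = FSCK_COMMAND.match(code)
        if not match:
            continue
        flags = [tok for tok in match.group(1).split() if tok.startswith("-")]
        if any("p" in tok[1:] for tok in flags):
            hits.append((number, line.strip()))
    return hits


# The /etc/rc fragment quoted above.
rc_fragment = """\
# Perform file system checks
# The -f flag skips the check if the log has been replayed successfully
fsck -fp
"""

print(unattended_fsck_lines(rc_fragment))
```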
When the fileserver process is restarted, volumes will probably be
accessible and valid. However, when subsequent data is added to
the partition (by creating files or volumes), volumes will be taken
off-line:
% vos listvol junior a -cell test.transarc.com
Total number of volumes on server junior partition /vicepa: 13
palo.rs.6 687709572 RW 433 K On-line
palo.rs.7 687709573 RW 46 K On-line
palo.rs.8 687709574 RW 46 K On-line
palo.rs.9 687709575 RW 433 K On-line
**** Could not attach volume 687709579 ****
**** Could not attach volume 687709567 ****
Total volumes onLine 11 ; Total volumes offLine 2 ; Total busy 0
At this point, the off-line volumes are in a state where a salvage
will probably remove them.
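
The affected volume IDs can be pulled out of saved `vos listvol` output before any salvage is attempted. This is a small sketch that assumes the failure lines read exactly as in the listing above ("Could not attach volume" followed by the numeric ID); the function name is illustrative.

```python
import re

# Matches the failure lines shown in the listing and captures the volume ID.
ATTACH_FAILURE = re.compile(r"Could not attach volume (\d+)")


def unattached_volume_ids(listvol_output):
    """Return the IDs of volumes that vos listvol reported it could not attach."""
    return [int(m.group(1)) for m in ATTACH_FAILURE.finditer(listvol_output)]


# Excerpt of the listing shown above.
listing = """\
palo.rs.8 687709574 RW 46 K On-line
palo.rs.9 687709575 RW 433 K On-line
**** Could not attach volume 687709579 ****
**** Could not attach volume 687709567 ****
Total volumes onLine 11 ; Total volumes offLine 2 ; Total busy 0
"""

print(unattached_volume_ids(listing))
```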
------- End Forwarded Message