[3150] in Release_7.7_team

Re: ext3 TSM backup problems

daemon@ATHENA.MIT.EDU (Bill Cattey)
Fri Feb 22 13:15:22 2002

Message-ID: <EwRcetBz0001QKg1ww@mit.edu>
Date: Fri, 22 Feb 2002 18:15:21 +0000 ()
From: Bill Cattey <wdc@MIT.EDU>
To: amb@MIT.EDU, Alex T Prengel <alexp@MIT.EDU>
CC: alexp@MIT.EDU, anger@MIT.EDU, davek@MIT.EDU, ops@MIT.EDU,
        release-team@MIT.EDU
In-Reply-To: <200202221742.MAA14737@dit.mit.edu>

(I've added ops and release-team to the CC list.  They have a stake here too.)

I will let Andrew reply to the technical facts of the best way to
implement the solution.  But I think it's important for you folks to
understand why we went to EXT3 in the first place:

This year (for reasons that are unclear) we were getting a
disproportionate number of reports of machines in the clusters offline
with the console messages saying that a manual fsck was required to
bring the system back online.

With Solaris, we remedied this problem by turning on journaling.
Later, the Linux EXT3 filesystem became available which had
the same journaling capability.  In order to remedy the problem of
lots of systems dead in the clusters, we rushed this change out with
only a couple months of testing.

Perhaps the right approach is to see if we can enable EXT3 for / ONLY if
PUBLIC=true.  We don't expect people on non-private machines to be
running TSM.
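
A minimal sketch of the PUBLIC=true test suggested above, assuming the
setting lives in an rc.conf-style file of plain shell assignments.  The
helper name want_ext3_root and the default path /etc/athena/rc.conf are
assumptions for illustration, not the actual release mechanism.

```shell
# Succeed (exit status 0) only when the configuration says PUBLIC=true.
# want_ext3_root and the default path are hypothetical, for illustration.
want_ext3_root() {
    conf_file="${1:-/etc/athena/rc.conf}"
    # Use a subshell so variables sourced from the file do not leak
    # into the caller's environment.
    (
        PUBLIC=false
        [ -r "$conf_file" ] && . "$conf_file"
        [ "$PUBLIC" = "true" ]
    )
}
```

An install or update script could then guard the ext3 conversion with
"if want_ext3_root; then ... fi" and leave private machines on ext2.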

I reiterate that it is a mystery to me how TSM can be taking such
inappropriate action because of the name of the filesystem type.  The
bits on the disk are the SAME.  That a remedy cannot be quickly crafted
for TSM raises questions in my mind about the architecture of the
software.

-wdc
---- summary background for newly added folks ----

The current Linux TSM backup client is not able to back up ext3 file
systems.  Linux-Athena has changed the local file system from ext2 to
ext3 as of Athena release 9.0.20, released in January 2002.

It is apparently a silent failure.  You don't find out the backup failed
until you attempt to restore.

Attempts to expand the local file system tree to back up files manually
with the dsm GUI will not reveal any files there, even though they may
exist. Scheduled TSM backups will silently fail without any obvious
error messages in the system log or error files.
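
Since the failure is silent, one way to see which machines are exposed
is to list the local file systems currently mounted as ext3.  The sketch
below assumes the usual /proc/mounts layout (device, mount point, type,
options, ...); list_ext3_mounts is a hypothetical helper name and is not
part of TSM or Athena.

```shell
# Print the mount point of every ext3 filesystem found in a
# /proc/mounts-style file.  Fields: device mountpoint fstype options ...
# list_ext3_mounts is a hypothetical helper, for illustration only.
list_ext3_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "ext3" { print $2 }' "$mounts_file"
}
```

Called with no argument it reads /proc/mounts.  Any mount point it
prints is a file system the current client would apparently skip.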

---- original message ----
To: amb@MIT.EDU, wdc@MIT.EDU
cc: alexp@MIT.EDU, anger@MIT.EDU, davek@MIT.EDU
Subject: ext3 TSM backup problems
Date: Fri, 22 Feb 2002 12:42:56 -0500
From: Alex T Prengel <alexp@MIT.EDU>


As I believe you both know, there's an issue going around about TSM
backup not working with ext3. One suggestion is reverting to ext2 as
per stasik's message below; another user (kahn- not sure if he's got
straight Linux or Linux-Athena) suggests this also. Would you folks like
to weigh in on this? Messing around with this stuff strikes me as
potentially dangerous...

                                             Alex

PS- amu just replied to stasik's message with:

>Stefan Stasik <stasik@MIT.EDU> writes:

>> reboot the machine.

>How did you do this?  It's possible that your edits didn't actually
>get synched to disk.  To keep this from happening, either run sync
>before rebooting or reboot with "shutdown -r now" rather than "reboot".



------- Forwarded Message

Date: Fri, 22 Feb 2002 12:19:28 -0500 (EST)
From: Stefan Stasik <stasik@MIT.EDU>
To: <linux-help@mit.edu>
cc: <davek@mit.edu>, <wccf-dist@bcs.MIT.EDU>, <alexp@mit.edu>,
        maxr <max@ai.MIT.EDU>
Subject: Converting ALS 9.0.24 from EXT3 to EXT2 root filesystem
Message-ID: <Pine.GSO.4.30L.0202221209520.27210-100000@nerd-xing.mit.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


Hello:

I am sure many of you are aware of the current TSM backup crisis on
Athena-Linux at the moment.  I am working quickly to figure out how to
make the machines back up again.  The best way I can figure is to force the
ALS partitions, including the ' / ' partition back to EXT2.

I have created a series of steps, to convert the root filesystem back to
EXT2, but I am stuck on one part.

boot the system into single user mode:

at lilo:  linux(smp) single

remount the / partition read-only:

mount -r -o remount /dev/hda7 /

Remove the EXT3 journal option from the filesystem:

/sbin/tune2fs -O^has_journal /dev/hda7

run fsck to check the filesystem:

/sbin/e2fsck /dev/hda7

remount the file system as read-write:

mount -rw -o remount /dev/hda7 /

edit /etc/fstab, and change the entry for ' / ' from 'ext3' to 'ext2'.

emacs /etc/fstab


reboot the machine.
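
The fstab edit in the steps above could also be scripted rather than
done by hand in an editor.  This is only a sketch: fstab_root_to_ext2 is
a hypothetical helper name, it writes the result to standard output
instead of modifying the file, and it does nothing about how the kernel
or boot procedure chooses the root filesystem type.

```shell
# Print an fstab-style file with the type of the "/" entry changed from
# ext3 to ext2.  All other lines, including comments, pass through as-is.
# fstab_root_to_ext2 is a hypothetical helper, for illustration only.
fstab_root_to_ext2() {
    fstab_file="${1:-/etc/fstab}"
    awk '
        # Only touch the non-comment entry mounted on "/" with type ext3.
        # Assigning to $3 rebuilds the line with single spaces between fields.
        $1 !~ /^#/ && $2 == "/" && $3 == "ext3" { $3 = "ext2" }
        { print }
    ' "$fstab_file"
}
```

For example, "fstab_root_to_ext2 /etc/fstab > /etc/fstab.new" leaves the
original untouched so the result can be reviewed before it is put in
place.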

However, when the machine reboots, it is still trying to mount the ' / '
partition as EXT3, and says the / partition is not an EXT3 filesystem, and
kernel panics.

It is starting 'kjournald', which seems to be forcing the ext3 mount.
I had assumed that changing the FS tag in fstab would be enough, but this
is obviously not so, and I am currently stuck.

Where is this happening, and what do you need to change in the boot
procedure to get it to mount the root partition as EXT2?

Thanks!

- Stefan Stasik

