[7160] in testers
early-test in w20
daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Thu Jun 16 23:11:47 2005
Message-Id: <200506170311.j5H3BWvl006914@detraction.mit.edu>
From: Jonathon Weiss <jweiss@MIT.EDU>
To: testers@MIT.EDU
Date: Thu, 16 Jun 2005 23:11:32 -0400
Well, the good news is that we went 4 for 4 updating suns to 9.4 in
the w20 cluster.
The bad news is that we went 0 for 5 on the linux machines. I believe
these all suffered some sort of "rpmupdate: failed to complete
transactions" problem as the root cause of the lossage. However, the
observed symptoms were different:
w20-575-3{1,2,4} all reported grub being unable to find the kernel to
boot it. I looked at #34 (boot off install cd, mount and chroot to the
on-disk filesystem), and indeed the 2.4 kernel was still there, and the
2.6 one wasn't. I'm not sure what really happened; the update.log
said the kernel rpm was upgraded, but running the rpm command
reported an error the first time, and hung all future times (even
after a reboot). I saved a copy of the install.log and re-installed
this machine. I left #31 and #32 in case anyone else wanted to look
at one.
#33 had the right kernel, but trashed itself badly enough that it
couldn't start the network or boot. rpm worked and told me there were
a pile of duplicate rpms in the rpmdb. I re-installed it.
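A minimal sketch of how duplicates like these could be listed, assuming
the input is the output of something like
rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n' saved from the
affected machine. The function name and the sample data are
hypothetical, not taken from #33. Note that some packages (the kernel,
for one) can legitimately be installed in more than one version, so a
hit here is only a hint.

```python
from collections import defaultdict


def find_duplicate_packages(lines):
    """Return {name: [version-release, ...]} for names listed more than once.

    Each input line is expected to look like "NAME VERSION-RELEASE".
    """
    versions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first space only: name on the left, version on the right.
        name, _, version_release = line.partition(" ")
        versions[name].append(version_release)
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Hypothetical sample input, not real data from the w20 cluster.
sample = [
    "foo 1.0-1",
    "foo 1.1-1",
    "bar 2.0-3",
]

duplicates = find_duplicate_packages(sample)
for name, found in sorted(duplicates.items()):
    print(name, "installed as", ", ".join(found))
```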
#35 might have been in the same state; it was a little hard to tell.
It needed fscking on the cache partition before I did anything else.
It had enough network that I was able to get off copies of the version
file and update log (it had apparently tried to update to 9.4 a couple
of times). It also had a slightly screwed up grub.conf (converted
from lilo by the update), but I don't know if the problems there were
the result of taking the update multiple times. After I booted it, it
couldn't start the net or boot. I left it in case anyone wanted to
look.
The grub.conf is below:
# grub.conf created from lilo.conf by Linux Athena migration script.
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
default=2
title Red Hat Enterprise Linux WS (2.6.9-5.EL)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL) (single user mode)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL) (single user mode) (single user mode)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7 single
initrd /initrd-2.6.9-5.EL.img
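The oddities in that file can be picked out mechanically. What follows
is only a rough sketch, assuming the legacy grub.conf layout shown
above (title / kernel / initrd lines); the function names and the three
checks are illustrative and are not part of the migration script. Also
note that legacy GRUB counts entries from zero, so default=2 selects the
third title.

```python
GRUB_CONF = """\
# grub.conf created from lilo.conf by Linux Athena migration script.
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
default=2
title Red Hat Enterprise Linux WS (2.6.9-5.EL)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL) (single user mode)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7
initrd /initrd-2.6.9-5.EL.img
title Linux-Athena (2.6.9-5.EL) (single user mode) (single user mode)
kernel /vmlinuz-2.6.9-5.EL ro root=/dev/hda7 single
initrd /initrd-2.6.9-5.EL.img
"""


def parse_stanzas(text):
    """Split grub.conf text into a list of dicts, one per 'title' entry."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        keyword, _, rest = line.partition(" ")
        if keyword == "title":
            stanzas.append({"title": rest.strip()})
        elif keyword in ("kernel", "initrd") and stanzas:
            stanzas[-1][keyword] = rest.strip()
    return stanzas


def check_stanzas(stanzas):
    """Return human-readable notes about suspicious entries."""
    problems = []
    seen = {}  # (kernel line, initrd line) -> index of first entry using it
    marker = "(single user mode)"
    for index, stanza in enumerate(stanzas):
        title = stanza["title"]
        kernel_args = stanza.get("kernel", "").split()
        if title.count(marker) > 1:
            problems.append(f"entry {index}: title repeats '{marker}'")
        if marker in title and "single" not in kernel_args:
            problems.append(f"entry {index}: single-user title but no 'single' kernel argument")
        key = (stanza.get("kernel"), stanza.get("initrd"))
        if key in seen:
            problems.append(f"entry {index}: same kernel and initrd lines as entry {seen[key]}")
        else:
            seen[key] = index
    return problems


stanzas = parse_stanzas(GRUB_CONF)
problems = check_stanzas(stanzas)
for problem in problems:
    print(problem)
```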
Both of the re-installed machines are now updating to 9.4 again, and
we'll see what happens.
Jonathon