[3407] in Release_7.7_team


state of the release

daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Wed Jul 17 06:02:33 2002

Message-Id: <200207171002.GAA02075@bearing-an-hourglass.mit.edu>
From: Jonathon Weiss <jweiss@MIT.EDU>
To: release-team@MIT.EDU
cc: ops@MIT.EDU
Date: Wed, 17 Jul 2002 06:02:31 -0400


I did a cluster walk tonight to check out the release. I found a
couple of non-fatal bugs in the linux release, and one bug in solaris
(the OS, not the release itself).

1) The rpm verification force-installs a lot of RPMs (possibly a huge
number on the first reboot after the update, and only a lot on
subsequent boots :-).  By itself this doesn't really cause any problems
other than to slow down the booting process.  Still, it is
aesthetically poor, and it makes it harder to use pieces of the
verification scripts to debug a machine, because there are so many
false positives.  Andrew, can you take a look at this once the video
card stuff is sorted out?

2) On the initial boot after taking 9.1 (and I'm 95% sure it's limited
to the initial boot), linux machines try and fail to start X while the
RPM verification is still in progress.  The failure mode is that the
screen flashes a few times, and then it whines about X respawning too
rapidly and disables it for 5 minutes.  At the end of the 5 minutes
(or sometimes 5 minutes after that, and I may have seen a machine go
for 15 minutes, but I'm not sure) it tries to restart X and succeeds
and xlogin pops up.  I determined that the verify was still running by
popping between different vt's.  This has the potential to confuse
people if they're around when their machine finishes the update, but
in all cases I've seen, the machine recovers on its own if you let it.

3) I found a double handful of solaris machines that hadn't updated yet
(in 37 and w20).  The problem appears to have been that they were
newly installed machines and had only booted once after they were
installed.  I gather that sometime shortly after cron started, the
time was reset (presumably by gettime/ntp), generally back by a few
hours.  Cron logged a warning about time running backwards and
apparently stopped running some or all cron jobs (including
reactivate).  I restarted cron on these machines and it seems to have
solved the problem.
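
To make the condition in 3) concrete: the trigger is a wall-clock
sample that is earlier than the one before it.  A minimal sketch of
spotting such a backward step in a series of timestamps follows.  The
function name and the tolerance parameter are made up for
illustration; this is not how cron itself does the check.

```python
def find_backward_steps(samples, tolerance=0.0):
    """Return (index, delta_seconds) for each sample that is earlier
    than its predecessor by more than `tolerance` seconds.

    `samples` is a list of wall-clock readings in seconds, in the
    order they were taken.  Hypothetical helper, for illustration only.
    """
    steps = []
    for i in range(1, len(samples)):
        delta = samples[i] - samples[i - 1]
        if delta < -tolerance:
            # The clock moved backwards between sample i-1 and sample i.
            steps.append((i, delta))
    return steps


# Example: readings a minute apart, with the clock set back two hours
# just before the third reading.
readings = [1000, 1060, 1120 - 7200, 1180 - 7200]
backward = find_backward_steps(readings)
```

Something along these lines could presumably be used by a watchdog to
decide when cron needs the restart described above, but that is only a
guess at a workaround, not something that exists in the release.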

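For anyone puzzled by the "respawning too rapidly" behavior in 2): it
looks like init's respawn throttle.  Here is a rough model of that
logic, assuming the stock sysvinit thresholds (10 respawns within 2
minutes, then a 5 minute hold-off).  The class and method names are
invented for illustration and this is not init's actual code.

```python
class RespawnThrottle:
    """Illustrative model of a respawn throttle (not init's real code)."""

    def __init__(self, max_spawns=10, window=120, holdoff=300):
        self.max_spawns = max_spawns  # assumed: respawns allowed per window
        self.window = window          # assumed: window length, in seconds
        self.holdoff = holdoff        # assumed: disable time, in seconds
        self.spawn_times = []
        self.disabled_until = None

    def may_spawn(self, now):
        """Return True if the process may be started at time `now` (seconds)."""
        if self.disabled_until is not None:
            if now < self.disabled_until:
                return False          # still inside the hold-off period
            # Hold-off is over: start counting from scratch.
            self.disabled_until = None
            self.spawn_times = []
        # Forget spawns that have fallen out of the window.
        self.spawn_times = [t for t in self.spawn_times
                            if now - t < self.window]
        if len(self.spawn_times) >= self.max_spawns:
            # Too many respawns too quickly: disable for `holdoff` seconds.
            self.disabled_until = now + self.holdoff
            return False
        self.spawn_times.append(now)
        return True


# X dying once a second: ten starts are allowed, the eleventh is refused.
throttle = RespawnThrottle()
attempts = [throttle.may_spawn(t) for t in range(11)]
```

Under this model, if X still can't start when the hold-off expires
(say, because the verify is still running), it gets throttled again
for another 5 minutes, which would be consistent with the machines
that took 10 minutes or more to come up.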

Little notable or amusing things I also found:

1) 1 linux box that failed to reboot after the update because a user
left a floppy in the drive.

2) a user idle for more than a week (in an electronic classroom)

3) 1 sun hung with one of the known bad xscreensaver hacks (presumably
in the user's config file).  The X server patch will be in the next
patch release, right?


	Jonathon
