[77] in Kakapo Windows Team
AFS &/or XP issues
daemon@ATHENA.MIT.EDU (Thomas L. Thornton)
Fri Aug 22 16:50:08 2003
Date: Fri, 22 Aug 2003 16:50:04 -0400 (EDT)
Message-Id: <200308222050.h7MKo4tE017345@the-rim.mit.edu>
From: "Thomas L. Thornton" <tomt@MIT.EDU>
To: contact-container-admins@MIT.EDU
CC: kakapo@MIT.EDU, asanka@MIT.EDU
Container admins,
Two non-critical but discomforting issues appear on currently deployed
win.mit.edu machines. There is a failing OpenAFS service and an XP
startup delay. Here follows a statement on where we stand with them.
First, the OpenAFS service, afsd, does not restart. This service is
notorious for failing, but under the previous IBM AFS the problem was
worked around by setting the OS to restart it routinely. Phil Thompson
in case 410179 sees perhaps a quarter of his machines fail to start the
service or fail to restart it; the details are unclear. The
service-restart setting that the IBM AFS installer used to carry and
apply is missing from the OpenAFS 08-06-2003 installer, so it must be
reproduced. As of today the setting is in our AFS code base, which
builds and produces a working installation, so we expect this problem
to disappear with our next AFS release.
Container admins who actually see afsd failing may want to try
manually setting the failure recovery actions. It is easy to do:
o Open up the Services snap-in in Administrative Tools (or type
`services.msc` at a command prompt).
o Double click the service named "TransarcAFS Client" (or
right-click and select "Properties").
o Click on the "Recovery" tab.
    In the "First failure" listbox, select "Restart the Service."
    In the "Second failure" listbox, select "Restart the Service."
    In the "Subsequent failures" listbox, leave "Take No Action."
    In the "Reset fail count after:" editbox, leave "0" days.
    In the "Restart service after:" editbox, enter "0" minutes.
o Click OK.
o Close the Services snap-in.
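For admins with many machines, the same recovery settings can probably be scripted with the `sc failure` command instead of clicking through the snap-in. The following is a minimal sketch, not something we have tested on win.mit.edu machines. It only builds the command line; it does not run it. The function name and the placeholder service name are hypothetical. Note that `sc` expects the service's key name, which may differ from the display name shown in the Services snap-in (`sc getkeyname` can look it up). The empty third action is assumed to correspond to "Take No Action"; check the result with `sc qfailure` before relying on it.

```python
def build_sc_failure_command(service_key_name, reset_seconds=0, restart_delay_ms=0):
    """Return the argument list for an `sc failure` call mirroring the Recovery tab steps.

    service_key_name: the service's key name (an assumption; look it up, do not guess).
    reset_seconds:    "Reset fail count after", which sc takes in seconds (0 days == 0).
    restart_delay_ms: "Restart service after", which sc takes in milliseconds (0 minutes == 0).
    """
    actions = "/".join([
        "restart", str(restart_delay_ms),  # first failure: restart the service
        "restart", str(restart_delay_ms),  # second failure: restart the service
        "", "0",                           # subsequent failures: empty action (assumed "no action")
    ])
    # sc parses "reset=" and "actions=" as option names followed by separate values.
    return ["sc", "failure", service_key_name,
            "reset=", str(reset_seconds),
            "actions=", actions]


if __name__ == "__main__":
    # "ExampleAfsServiceKey" is a placeholder, not the real key name.
    print(" ".join(build_sc_failure_command("ExampleAfsServiceKey")))
```

The returned list can be passed to `subprocess.run` from an administrator account once the key name has been confirmed on a test machine.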
Second, XP machines exhibit a startup delay. With the current Pismere
and OpenAFS installers, we see XP machines spend about 2.5 minutes in
the "Applying Computer Settings" interval of startup, perhaps too long
but tolerable. This seems constant across a variety of hardware
platforms. Asanka Herath comments that he observes this even on
machines without OpenAFS or a loopback.
However, some report that their machines take significantly longer. Dan
Stratila in case 404542 sees his machine take over 5 minutes, and Barry
Stoelzel reports the same delay. Chad Dupuis in case 408880 says that
37-312 machines (names not reported) take 12 minutes, perhaps because
he sets a couple of other important services, such as Auto Update, to
start there.
Experiments on at least two machines show that changing the order in
which network providers are listed can bring startup time back down to
2.5 minutes. Nevertheless, we cannot in good faith recommend that
admins try this, since altering the order breaks the ability to remove
the current OpenAFS installation. A fix that permits reordering while
retaining the ability to uninstall is in our AFS code base and test
installers.
We may need to look deeper into the perhaps-inevitable 2.5-minute
delay and the vagaries of network provider reordering. Paul and Asanka
are trying to gauge the time required to do this and will have a
tentative date for us soon, on the order of days.
-Tom