[1325] in testers


Re: Restart cleanup bug

daemon@ATHENA.MIT.EDU (daemon@ATHENA.MIT.EDU)
Tue Dec 4 18:20:15 1990

Date: Tue, 4 Dec 90 18:19:49 -0500
To: dkk@MIT.EDU
Cc: testers@MIT.EDU
In-Reply-To: David Krikorian's message of Tue, 4 Dec 90 04:21:05 -0500,
From: Richard Basch <probe@MIT.EDU>


   Date: Tue, 4 Dec 90 04:21:05 -0500
   From: David Krikorian <dkk@ATHENA.MIT.EDU>
   Reply-To: dkk@mit.edu
   Home: 47 Lake St., Arlington, MA 02174, (617) 646-9289
   Office: MIT Bldg. E40-358A, (617) 253-8651, 258-8736 (fax)


   When the cleanup runs, there is an /etc/nologin present.  On Alecto, a
   vs2 auto-updated to 7.2C, I logged in on the console (with ^P) and
   caused the /etc/nologin to stay around.  The symptoms include an error
   in the console window:

     04:11 cleanup: /etc/nologin already exists, not performing cleanup.

   and the following:

   alecto# ls -l /etc/nologin ; cat /etc/nologin ; last -1
   -rw-r--r--  1 root           62 Dec  4 01:18 /etc/nologin
   This machine is down for cleanup; try again in a few seconds.
   root      console                   Tue Dec  4 01:18 - 01:39  (00:20)

   The time on the /etc/nologin is the same as my login time on the
   console.  For detail on what happened at 01:18, read on.

   When I typed ^P the first time, I got the login banner (because I
   pressed the control key).  When I typed ^P the second time, I got
   "Console login requested" but no login prompt.  A few (<< 60) seconds
   later I got the login banner again, and typed ^P a third time.  This
   time I got a login prompt and logged in as root.

Perhaps the machine was in the middle of a "reactivate" at that moment.
If so, we hope that this bug will not be present in 7.2D.  However,
since we have no way of reproducing this exact timing, we cannot be
sure that the problem is resolved.

Anyway, in 7.2D, we have forced reactivate's output to go to
/dev/console when it is not being run by a user, so that if ttyv0 is
closed (such as during a ctrl-p), reactivate will not get a SIGTTOU
(if any error messages are generated).  In addition, reactivate will
now ignore HUP and TERM signals that might be propagated when the
parent process dies (possibly with the shell passing them on to
cleanup at an inopportune moment, thereby leaving /etc/nologin behind).
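
For illustration only, the two precautions could be sketched in sh
roughly as follows.  This is not the actual 7.2D reactivate code: the
function names are hypothetical, and "not being run by a user" is
approximated here as "stdin is not a terminal", which is an assumption.

```shell
# Hypothetical sketch; not the real reactivate script.

# Ignore HUP and TERM so that a dying parent (or a shell forwarding
# its signals) cannot interrupt the script partway through.
ignore_parent_signals() {
    trap '' HUP TERM
}

# Send stdout and stderr to the console device unless a user appears
# to be running the script.  The device path is a parameter only so
# that the sketch can be tried against an ordinary file.
# Assumption: "run by a user" is detected as "stdin is a terminal".
redirect_to_console() {
    console=${1:-/dev/console}
    if [ ! -t 0 ] && [ -w "$console" ]; then
        exec >>"$console" 2>&1
    fi
}
```

A script would call ignore_parent_signals and then redirect_to_console
before doing any work, so that later error messages are written to the
console rather than to a tty that may have been closed.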

As you can see, a lot of precautions have been added to 7.2D, based on
various suppositions, but we may still be overlooking another possible
cause of what you have described.  If you have noted anything else that
might pertain to this situation, please let us know.

Thanks,
-Richard
