[19890] in Athena Bugs


Machines which fail to take updates allow logins but don't run

daemon@ATHENA.MIT.EDU (Mitchell E Berger)
Sat Oct 13 19:22:19 2001

Message-Id: <200110132322.TAA23714@athenaphobia.mit.edu>
To: bugs@MIT.EDU
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 13 Oct 2001 19:22:17 -0400
From: Mitchell E Berger <mitchb@MIT.EDU>

The other day, a friend of mine mentioned that he was working on a cluster 
machine and every so often a message about being in the middle of an update 
would pop up in the console.  Yesterday, I logged into a cluster machine that 
happened to be in this state.  The cause turned out to be that the machine 
died while taking the aborted 9.0.17 patch release on October 2nd, and was 
then rebooted without being fixed.

Usually, logins aren't permitted during updates because reactivate creates an 
/etc/nologin file before starting the autoupdate.  Unfortunately, the athena 
rc script removes that file unconditionally on boot, which means that a 
machine that fails partway through an update will allow logins once it is 
rebooted.  I found at least 5 cluster machines in this state by scanning the 
public-linux cluster with athinfo.
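
For anyone who wants to repeat that kind of scan, here is a rough sketch, 
not the exact commands I ran.  It assumes that "athinfo HOST version" prints 
the host's version history and that an interrupted update leaves "Version 
Update" on the last line.  The function name find_stuck_hosts is made up for 
illustration, and you have to supply the host names yourself.

```shell
#!/bin/sh
# Sketch only. Assumes "athinfo HOST version" prints the host's version
# history and that an interrupted update leaves "Version Update" on the
# last line. find_stuck_hosts is a hypothetical helper name.

# Print each host whose most recent version line marks an unfinished update.
find_stuck_hosts() {
    for host in "$@"; do
        # Hosts that do not answer produce an empty last_line and are skipped.
        last_line=$(athinfo "$host" version 2>/dev/null | tail -n 1)
        case "$last_line" in
            *"Version Update"*) echo "$host" ;;
        esac
    done
}
```

It would be called with a list of machine names as arguments, and prints one 
stuck host per line.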

This is more than a minor annoyance; it's a security hole.  reactivate quits 
without doing anything if it finds the machine not at a real Athena version 
(i.e. at Version Update).  This means that these machines aren't running their 
cleanup scripts or closing leftover security issues, yet they allow users to 
log in and don't appear broken.  I think the right thing to do here is to 
remove /etc/nologin at boot only if the machine is not in the middle of an 
update.  That would do two things:

- Prevent normal users from logging into a broken machine.
- Cause the machine to cry for help when someone tries to log in.

I'm submitting a patch to do this.
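
To illustrate the idea (this is a sketch, not the submitted patch), the 
boot-time check might look roughly like the following.  The function names 
are hypothetical, the file paths are passed in as arguments rather than 
hard-coded, and the test for an unfinished update assumes that the last line 
of the version file contains "Version Update" while an update is under way.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the version-file
# format ("Version Update" on the last line during an update) is assumed.

# Return 0 if the last line of the version file marks an unfinished update.
update_in_progress() {
    version_file=$1
    [ -r "$version_file" ] || return 1
    case "$(tail -n 1 "$version_file")" in
        *"Version Update"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Remove the nologin file only when no update is pending.
# Returns 1 (and leaves the file alone) if an update was interrupted.
clear_nologin() {
    version_file=$1
    nologin_file=$2
    if update_in_progress "$version_file"; then
        echo "Update incomplete; leaving $nologin_file in place" >&2
        return 1
    fi
    rm -f "$nologin_file"
}
```

An rc script could call something like "clear_nologin /etc/athena/version 
/etc/nologin" in place of the unconditional removal; the version file path 
there is an assumption too.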

Mitch

