[444] in athena10
Login chroot design for Athena 10
daemon@ATHENA.MIT.EDU (ghudson@MIT.EDU)
Tue Aug 19 13:27:09 2008
Date: Tue, 19 Aug 2008 13:26:03 -0400 (EDT)
From: ghudson@MIT.EDU
Message-Id: <200808191726.m7JHQ3kp013052@outgoing-legacy.mit.edu>
To: athena10@mit.edu

Since Ken Arnold raised the point that login chroots would simplify
the update architecture, I'm going to take some time this week to
explore implementing them.

The basic idea would be a three-tiered logical volume setup:
* login-master is the long-term, mutable chroot.
* login-stable is a snapshot of login-master made after an update.
* login is an ephemeral snapshot of login-stable used for a session.

(Assumption: LVM supports snapshots of snapshots. If it does not,
then the corner cases get a little uglier. We can create login from
login-master just before updating login-master, but what if someone
does a fast login and logout before the update completes?)
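
To make the three tiers concrete, here is a rough sketch that only
builds the lvcreate command lines and prints them; it does not run
anything. The volume group name "athena", the sizes, and the helper
names are all made up for illustration, and the last command assumes
that LVM will accept a snapshot whose origin is itself a snapshot,
which is exactly the thing that still needs verifying.

```python
# Hypothetical: the volume group name, the sizes, and these helpers are
# illustration only; nothing here is executed against LVM.
VG = "athena"


def create_master_cmd(size="10G"):
    """Command line to create the long-term, mutable login-master volume."""
    return ["lvcreate", "--name", "login-master", "--size", size, VG]


def snapshot_cmd(origin, name, size="2G"):
    """Command line to create snapshot `name` of logical volume `origin`."""
    return ["lvcreate", "--snapshot", "--name", name,
            "--size", size, f"/dev/{VG}/{origin}"]


def tier_commands():
    """Command lines for the three tiers, in creation order."""
    return [
        create_master_cmd(),
        # Taken after an update of login-master completes.
        snapshot_cmd("login-master", "login-stable"),
        # Taken per session; assumes snapshots of snapshots are allowed.
        snapshot_cmd("login-stable", "login"),
    ]


if __name__ == "__main__":
    for cmd in tier_commands():
        print(" ".join(cmd))
```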

Some considerations:
* We need to pick a size for the logical volumes--small enough that
we don't overfill cluster machine disks and big enough that we
don't run out of space for software packages. /tmp and /var/tmp
will be coming from the host so that's not an issue. I'm not sure
if snapshots
* There are some concurrency issues to worry about with
login-stable, as Ken pointed out.
* The cluster-software package wants to be installed in
login-master, not on the host. (So it gets removed from
debathena-cluster.)
* login-master probably wants debathena-workstation installed, not
debathena-cluster. (In particular, we don't want the login chroot
to be trying to create its own interior login chroot!)
* If we use the chroot via schroot in an /etc/X11/Xsession.d script,
then the failsafe session won't use the chroot unless we go to
extra effort to do so. There are some positive and negative
ramifications of that; it makes it easier to get root on the host,
and it means cluster-software won't be available in a failsafe
session.
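
On the concurrency point, one possible discipline (a sketch, not
necessarily what we will end up picking) is a reader/writer lock
around login-stable: a session takes a shared lock while it snapshots
login-stable, and the updater takes an exclusive lock while it
replaces login-stable. The lock file path and the helper name below
are hypothetical.

```python
import contextlib
import fcntl
import os

# Hypothetical lock file location.
LOCK_PATH = "/var/lock/login-stable.lock"


@contextlib.contextmanager
def login_stable_lock(exclusive, path=LOCK_PATH):
    """Hold a shared or exclusive flock() on the lock file.

    Sessions would use exclusive=False while snapshotting login-stable;
    the updater would use exclusive=True while replacing it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until the requested lock can be granted.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)
```

A PreSession hook would wrap its snapshot step in
"with login_stable_lock(exclusive=False):", and the update package
would wrap its replacement of login-stable in
"with login_stable_lock(exclusive=True):". Many logins can then
proceed at once, but none can overlap with an update of login-stable.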

So the task breakdown is:
1. Verify that snapshots of snapshots work in LVM. If they don't,
back to the drawing board.
2. Pick a discipline for handling the concurrency issues around
login-stable.
3. Implement a package which (a) creates and updates login-master,
and (b) manages login-stable.
4. Implement a package taking over the gdm PreSession and
PostSession scripts. In PreSession, snapshot login-stable to
create login and set it up in schroot. In PostSession, kill all
user processes and destroy login. Also add an Xsession script
to schroot
5. debathena-auto-update still needs to exist to keep the host up to
date, but can be redesigned to stop interacting with gdm for the
most part. (It still needs to reboot the machine after a kernel
upgrade; I forgot about that requirement in the current
implementation.)
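
For the PostSession half of item 4, the ordering matters: user
processes have to go first so nothing is still holding the chroot
open when the volume is removed. A minimal sketch of that teardown,
again only building command lines; the volume group name "athena"
and the function name are hypothetical, and the step that ends the
schroot session or unmounts the chroot is left out because that
setup is not designed yet.

```python
# Hypothetical volume group name; commands are built, not executed.
VG = "athena"


def post_session_cmds(user):
    """Command lines for PostSession teardown, in the order to run them."""
    return [
        # Kill everything still running as the user. pkill exits with
        # status 1 if nothing matched, which should not count as failure.
        ["pkill", "-KILL", "-u", user],
        # (Ending the schroot session / unmounting would go here.)
        # Destroy the ephemeral per-session snapshot.
        ["lvremove", "-f", f"/dev/{VG}/login"],
    ]


if __name__ == "__main__":
    for cmd in post_session_cmds("jdoe"):
        print(" ".join(cmd))
```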

I'm still a bit nervous that we'll discover software that can't be
made to work in a chroot. This is definitely forging into less
charted territory, which can mean higher maintenance costs. But it
does have substantial advantages for cleanup and updates.