[7866] in Release_7.7_team
Re: recovery hook to clean up /boot
daemon@ATHENA.MIT.EDU (Jonathan Reed)
Wed Dec 26 10:27:38 2012
Date: Wed, 26 Dec 2012 10:27:30 -0500 (EST)
From: Jonathan Reed <jdreed@MIT.EDU>
To: Jonathon Weiss <jweiss@MIT.EDU>
cc: release-team@MIT.EDU
In-Reply-To: <544DA0A9-CD71-47E2-8AEB-C93ECCEBB8BD@mit.edu>
Message-ID: <alpine.DEB.2.02.1212261025150.3447@infinite-loop.mit.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
I tested the new update hook on two VMs and one physical machine, and it
worked. It is now active in the locker. I plan to wait until some
machines have taken their early-afternoon update (~2pm), and then we'll see
what explodes. The "abort" sequence consists of:
rm /mit/athena10/update-hook/debathena-update-hook*
If none of the 2pm machines explode (including the ones with completely
full /boots), I'll consider this successful and add the cleanup code to a
new version of the auto-updater.
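
For testing the selection logic without touching dpkg or apt, here is a
rough sketch of it as a standalone function. This is not the deployed
hook: the function name select_old_kernels and its argument are
hypothetical, and it assumes GNU sort (for -V), as the hook itself does.
It reads kernel version strings on stdin and prints the linux-image
packages that would be candidates for removal, keeping the two newest
versions and the running kernel.

```shell
#!/bin/bash
# Sketch only: select_old_kernels is a hypothetical helper, not part of
# the deployed hook.
# stdin: kernel versions, one per line (for example 3.2.0-35-generic;
#        that version number is purely illustrative)
# $1:    the running kernel version, i.e. what "uname -r" reports
# stdout: linux-image-<version> package names that could be removed
select_old_kernels() {
    local running="$1" keep=2 kernels count k
    # Drop blank lines, then sort oldest to newest by version.
    kernels=$(sed '/^$/d' | sort -V)
    # grep -c counts non-empty lines, so an empty list counts as 0
    # (echo "" | wc -l would report 1).
    count=$(printf '%s\n' "$kernels" | grep -c . || true)
    if [ "$count" -le "$keep" ]; then
        return 0
    fi
    # Everything except the newest $keep versions is a candidate,
    # but never the kernel that is currently running.
    printf '%s\n' "$kernels" | head -n "$((count - keep))" | while read -r k; do
        if [ "$k" != "$running" ]; then
            echo "linux-image-$k"
        fi
    done
}
```

Feeding it the output of the dpkg-query/awk/sed pipeline from version 2,
with "$(uname -r)" as the argument, should list the same packages the
hook would pass to apt-get, without removing anything.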
-Jon
On Fri, 21 Dec 2012, Jonathan Reed wrote:
>
> On Dec 21, 2012, at 2:40 PM, Jonathon Weiss wrote:
>
>>
>> Several comments:
>>
>>
>> 1) I think your mail client may have folded the first line or two.
>>
>> 2) When I tried this on my machine, it prompted for confirmation before
>> removing stuff. I'm not sure what it would do if it were run without a
>> tty, nor am I sure my workstation isn't a special snowflake in this case.
>
> Correct, it wants -y, my bad.
>
>> 3) it makes no attempt to remove the corresponding linux-headers-<vers>
>> or linux-headers-<vers>-generic. These are certainly less critical in
>> that they don't dump stuff on /boot, but seem like they should probably
>> be handled too.
>
> I'm operating under the "keep it simple" principle here, since cluster machines have ~unlimited local disk (and the recovery hook only operates on cluster machines), and the goal is really to fix /boot. We can definitely come up with something more robust for the auto-updater (because this code, or something similar, should be part of its cleanup).
>
>> 4) Its idempotency is wacky, since the removed images are still listed
>> as having config files around so they still get picked up by the
>> dpkg-query, though the later removal doesn't remove the packages because
>> they are already gone. So I don't think any real harm is caused here,
>> but it isn't exactly pretty.
>
> Correct, I should check "${Status}" too and see if it's "installed" before attempting to remove.
> Here's version 2:
>
> --------------------- SNIP -----------------------
> #!/bin/bash
>
> kernels=$(dpkg-query -W -f '${Package}:${Status}\n' linux-image-\*-generic | \
>     awk -F ':' '$2=="install ok installed" {print $1;}' | \
>     sed -e 's/^linux-image-//' | sort -V)
> numkernels=$(echo "$kernels" | wc -l)
> if [ $numkernels -le 2 ]; then
>     exit 0
> fi
> toremove=$(echo "$kernels" | head -n $(($numkernels-2)))
> kpkgs=
> for k in $toremove; do
>     if [ "$(uname -r)" != "$k" ]; then
>         kpkgs="$kpkgs linux-image-$k"
>     fi
> done
> if apt-get -y -s remove $kpkgs; then
>     apt-get -y remove $kpkgs
> fi
>
> --------------------- SNIP -----------------------
>
>
>>
>> --
>>
>> Jonathon
>>
>>
>>
>>
>>
>> Jonathan Reed <jdreed@MIT.EDU> wrote:
>>
>>> We apparently filled up /boot on a bunch of cluster workstations, and
>>> users are helpfully getting notified about this. We need to clean it
>>> up.
>>>
>>> I propose the following recovery hook. (Cluster workstations should
>>> only have linux-image-generic). I plan to push this out on the 26th
>>> unless people object. (I'm not here on the 21st, and, uh, that's a
>>> bad day to push out additional things)
>>>
>>> Silence will be interpreted as approval.
>>>
>>> #!/bin/bash
>>>
>>> kernels=$(dpkg-query -W -f '${Package}\n' linux-image-\*-generic | sed
>>> -e
>>> 's/^linux-image-//' | sort -V)
>>> numkernels=$(echo "$kernels" | wc -l)
>>> if [ $numkernels -le 2 ]; then
>>> exit 0
>>> fi
>>> toremove=$(echo "$kernels" | head -$(($numkernels-2)))
>>> kpkgs=
>>> for k in $toremove; do
>>> if [ "$(uname -r)" != "$k" ]; then
>>> kpkgs="$kpkgs linux-image-$k"
>>> fi
>>> done
>>> if apt-get -s remove $kpkgs; then
>>> apt-get remove $kpkgs
>>> fi
>>>
>>>
>
>