[7866] in Release_7.7_team


Re: recovery hook to clean up /boot

daemon@ATHENA.MIT.EDU (Jonathan Reed)
Wed Dec 26 10:27:38 2012

Date: Wed, 26 Dec 2012 10:27:30 -0500 (EST)
From: Jonathan Reed <jdreed@MIT.EDU>
To: Jonathon Weiss <jweiss@MIT.EDU>
cc: release-team@MIT.EDU
In-Reply-To: <544DA0A9-CD71-47E2-8AEB-C93ECCEBB8BD@mit.edu>
Message-ID: <alpine.DEB.2.02.1212261025150.3447@infinite-loop.mit.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

I tested the new update hook on two VMs and one physical machine, and it
worked.  It is now active in the locker.  I plan to wait until some
machines have taken their early-afternoon update (~2pm) and see what
explodes.  The "abort" sequence consists of:
    rm /mit/athena10/update-hook/debathena-update-hook*
If none of the 2pm machines explode (including the ones with completely 
full /boots), I'll consider this successful and add the cleanup code to a 
new version of the auto-updater.
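
For anyone who wants to poke at the selection rule without touching apt, here is a minimal sketch of it as a standalone function. It assumes the same rule as the hook quoted below (keep the two newest kernels, never remove the running one). The name select_old_kernels and the optional "keep" argument are made up for illustration and are not part of the hook; sort -V assumes GNU coreutils, as the hook itself does.

```shell
#!/bin/bash
# select_old_kernels is a hypothetical helper, not part of the deployed hook.
#   $1    running kernel version (what `uname -r` prints)
#   $2    number of newest kernels to keep (assumption: defaults to 2,
#         the value the hook hard-codes)
#   stdin installed kernel versions, one per line, in any order
# Prints the versions that would be removed, oldest first.
select_old_kernels() {
  local running=$1 keep=${2:-2}
  # sort -V orders version strings oldest to newest (GNU sort).
  # awk collects the non-blank lines, then prints all but the last
  # $keep of them, skipping the running kernel.
  sort -V | awk -v keep="$keep" -v running="$running" '
    NF  { v[++n] = $0 }
    END { for (i = 1; i <= n - keep; i++) if (v[i] != running) print v[i] }'
}

printf '%s\n' 3.2.0-29-generic 3.2.0-35-generic 3.2.0-31-generic 3.2.0-34-generic |
  select_old_kernels 3.2.0-29-generic
# prints 3.2.0-31-generic
```

In the example, 3.2.0-29 and 3.2.0-31 are the removal candidates, and 3.2.0-29 is skipped because it is the running kernel.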

-Jon


On Fri, 21 Dec 2012, Jonathan Reed wrote:

>
> On Dec 21, 2012, at 2:40 PM, Jonathon Weiss wrote:
>
>>
>> Several comments:
>>
>>
>> 1) I think your mail client may have folded the first line or two.
>>
>> 2) When I tried this on my machine, it prompted for confirmation before
>> removing stuff.  I'm not sure what it would do if it were run without a
>> tty, nor am I sure my workstation isn't a special snowflake in this case.
>
> Correct, it wants -y, my bad.
>
>> 3) it makes no attempt to remove the corresponding linux-headers-<vers>
>> or linux-headers-<vers>-generic.  These are certainly less critical in
>> that they don't dump stuff on /boot, but seem like they should probably
>> be handled too.
>
> I'm operating under the "keep it simple" principle here, since cluster machines have ~unlimited local disk (and the recovery hook only operates on cluster machines), and the goal is really to fix /boot.  We can definitely come up with something more robust for the auto-updater (because this code, or something similar, should be part of its cleanup).
>
>> 4) Its idempotency is wacky, since the removed images are still listed
>> as having config files around so they still get picked up by the
>> dpkg-query, though the later removal doesn't remove the packages because
>> they are already gone.  So I don't think any real harm is caused here,
>> but it isn't exactly pretty.
>
> Correct, I should check "${Status}" too and see if it's "installed" before attempting to remove.
> Here's version 2:
>
> --------------------- SNIP -----------------------
> #!/bin/bash
>
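> # List installed generic kernel images as bare versions, oldest first.
> # Packages that were removed but still have config files report a
> # different Status, so they are filtered out here.
> kernels=$(dpkg-query -W -f '${Package}:${Status}\n' linux-image-\*-generic | \
>    awk -F ':' '$2=="install ok installed" {print $1;}' | \
>    sed -e 's/^linux-image-//' | sort -V)
> numkernels=$(echo "$kernels" | wc -l)
> # Nothing to do unless more than two kernels are installed.
> if [ "$numkernels" -le 2 ]; then
>   exit 0
> fi
> # Everything except the two newest is a candidate for removal.
> toremove=$(echo "$kernels" | head -n $((numkernels - 2)))
> kpkgs=
> for k in $toremove; do
>   # Never remove the running kernel.
>   if [ "$(uname -r)" != "$k" ]; then
>     kpkgs="$kpkgs linux-image-$k"
>   fi
> done
> # Dry run first; only remove for real if the simulation succeeds.
> if apt-get -y -s remove $kpkgs; then
>   apt-get -y remove $kpkgs
> fi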
>
> --------------------- SNIP -----------------------
>
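
Point 3 above (the matching linux-headers packages) could be folded in along these lines. This is only a sketch: kernel_packages is a hypothetical helper, and it assumes the usual Ubuntu naming in which kernel 3.2.0-35-generic has linux-headers-3.2.0-35-generic plus a flavour-independent linux-headers-3.2.0-35. It only builds names; whether each package is actually installed would still need the same Status check as above.

```shell
# kernel_packages is a hypothetical helper, not part of the hook.
# Given one kernel version (e.g. 3.2.0-35-generic), print the image and
# headers package names that belong to it, one per line.
kernel_packages() {
  local k=$1
  local base=${k%-generic}   # 3.2.0-35-generic -> 3.2.0-35
  printf '%s\n' "linux-image-$k" "linux-headers-$k"
  # Only emit the flavour-independent headers package when stripping the
  # -generic suffix actually produced a different name.
  if [ "$base" != "$k" ]; then
    printf '%s\n' "linux-headers-$base"
  fi
}
```

In the loop, kpkgs="$kpkgs $(kernel_packages "$k")" would then replace the single linux-image-$k, relying on the same word splitting the hook already uses when it passes $kpkgs to apt-get.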
>
>>
>> --
>>
>> 	Jonathon
>>
>>
>>
>>
>>
>> Jonathan Reed <jdreed@MIT.EDU> wrote:
>>
>>> We apparently filled up /boot on a bunch of cluster workstations, and
>>> users are helpfully getting notified about this.  We need to clean it
>>> up.
>>>
>>> I propose the following recovery hook.  (Cluster workstations should
>>> only have linux-image-generic).   I plan to push this out on the 26th
>>> unless people object.  (I'm not here on the 21st, and, uh, that's a
>>> bad day to push out additional things)
>>>
>>> Silence will be interpreted as approval.
>>>
>>> #!/bin/bash
>>>
>>> kernels=$(dpkg-query -W -f '${Package}\n' linux-image-\*-generic | sed
>>> -e
>>> 's/^linux-image-//' | sort -V)
>>> numkernels=$(echo "$kernels" | wc -l)
>>> if [ $numkernels -le 2 ]; then
>>>    exit 0
>>> fi
>>> toremove=$(echo "$kernels" | head -$(($numkernels-2)))
>>> kpkgs=
>>> for k in $toremove; do
>>>    if [ "$(uname -r)" != "$k" ]; then
>>> 	kpkgs="$kpkgs linux-image-$k"
>>>    fi
>>> done
>>> if apt-get -s remove $kpkgs; then
>>>    apt-get remove $kpkgs
>>> fi
>>>
>>>
>
>
