[7927] in testers


Re: 3 comments

daemon@ATHENA.MIT.EDU (Michael Khusid)
Fri Mar 13 17:06:34 2009

Message-ID: <49BACA95.1070508@mit.edu>
Date: Fri, 13 Mar 2009 17:05:25 -0400
From: Michael Khusid <mkhusid@MIT.EDU>
MIME-Version: 1.0
To: testers@mit.edu
In-Reply-To: <1236975520.813.83.camel@guru.mit.edu>

Ironically, I reproduced the same hang when I logged out on m66-080-2. 

Based on my experience with file systems, this looks like an OpenAFS bug:
it is unable to close file handles in time, so Ubuntu gives up and closes
the network connection, which causes the crash.

Do we have crash dump capability on the cluster machines?

Mike

Christine L Moulen wrote:
> On Fri, 2009-03-13 at 15:27 -0400, Michael Khusid wrote:
>   
>> 1. I just ran into a hung Debathena workstation m66-080-1.  I can see a 
>> kernel panic on the screen.  The machine has not rebooted.
>>
>> In the trace, the system was going into a reboot.  It tried to unmount
>> AFS, which failed for some reason.  Then it wasn't able to unload the
>> openafs kernel module.  Afterwards, it segfaulted during "unmounting
>> local filesystems".
>>     
>
> I've had my workstation lock up like this too.  Network was flaky, lost
> AFS, and it could not unmount AFS on a reboot attempt.  I eventually
> power-cycled it.
>
>   
