[1420] in linux-net channel archive

Re: Strange behaviour with NFS

daemon@ATHENA.MIT.EDU (Ed Carp [khijol SysAdmin])
Thu Nov 23 19:42:22 1995

From: "Ed Carp [khijol SysAdmin]" <khijol!erc@vger.rutgers.edu>
To: khijol!devel.ipacific.net.au!inet:becker@vger.rutgers.edu (Donald Becker)
Date: Thu, 23 Nov 1995 14:37:58 -0600 (CST)
Cc: khijol!vger.rutgers.edu!linux-net@vger.rutgers.edu
In-Reply-To: <199511230050.LAA27582@warp.ipacific.net.au> from "Donald Becker" at Jul 26, 95 00:15:00 am
Reply-To: khijol!netcom.com!ecarp@vger.rutgers.edu

-----BEGIN PGP SIGNED MESSAGE-----

> >From: joey@finlandia.Infodrom.North.DE (Martin Schulze)
> >Subject: Strange behaviour with NFS
> 
> >I found some strange behaviour regarding the nfs filesystem in the
> >kernel.
> >
> >Whenever the nfs server isn't reachable the process on the client
> >machine just hangs around, partially in 'D' status which means
> >uninterruptible.
> 
> I also encountered this problem today.  This bug hung a few nodes on our
> Linux cluster.  Luckily one still had a few process slots left, enough to
> do a 'ps'.  Here are a few notes:
> 	1. The processes were in the 'D' disk-wait state.  Most were
> 	swapped out.
> 	2. The processes counted toward the load average, but didn't consume
> 	CPU time.  The load average on the still-working machine was >35.
> 	3. Doing 'kill -1' and 'kill -9' had no effect.  The processes
> 	didn't even turn into zombies.
> 
> The problem apparently started when a few NFS serving nodes were
> unavailable.  I assume NFS clients might have timed out.  Even when the
> servers were returned to service the processes were still hung.
> 
> Donald Becker				 becker@cesdis.gsfc.nasa.gov
>  USRA Center of Excellence in Space Data and Information Sciences.
>  Code 930.5,  Goddard Space Flight Center,  Greenbelt, MD.  20771
>  301-286-0882	     http://cesdis.gsfc.nasa.gov/pub/people/becker/whoiam.html

Um, this isn't limited to Linux - Solaris also exhibits this behavior.
When one of our NFS servers dumps the big one, we often have to reboot
the clients.  Someone suggested that the mounts be changed from 'hard'
to 'soft', but I don't think this would make any difference.
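
For anyone trying to confirm the same symptom on a Linux client, here is a
rough sketch (not part of the original reports) that lists processes sitting
in the 'D' state by reading /proc/<pid>/stat.  The names parse_stat and
uninterruptible are made up for illustration, and it assumes a Linux-style
/proc is mounted.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    # The process state is the first field after the closing parenthesis.
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(proc_root="/proc"):
    """List (pid, comm) for processes currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the entry is unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

This only reports which processes are stuck; it does nothing to unstick them.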

Suggestions welcome...
- --
Ed Carp, N7EKG    			Ed.Carp@linux.org, ecarp@netcom.com
					214/993-3935	voicemail/pager
Finger ecarp@netcom.com for PGP 2.5 public key		an88744@anon.penet.fi

Q.	What's the trouble with writing an MS-DOS program to emulate Clinton?
A.	Figuring out what to do with the other 639K of memory.

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQCVAwUBMLTbpCS9AwzY9LDxAQGcKgP/Y3R5fZcC4jPInzNxeMOr5LjfjeBX2lsi
O6/hEPfxQKVcI3fE0XZE7t8cRohkMjL4vV69NEVJhc3VMgrvFZqwoRgGsz79eL8g
ezweNAjUxzwA9xVylBqo8pDhKoyUoMS2b603Jff63ckAlvRyGLaI8yJ5q+h5TTkq
zev6sqRxebs=
=xLru
-----END PGP SIGNATURE-----
