[1172] in Release_7.7_team


Re: Problems with NFS timeouts

daemon@ATHENA.MIT.EDU (Ezra Peisach)
Wed Dec 17 06:49:31 1997

To: Jonathon Weiss <jweiss@MIT.EDU>
Cc: athena-rcc@MIT.EDU, ezra@MIT.EDU, f_l@MIT.EDU, haldane@MIT.EDU, jf@MIT.EDU,
        karen@MIT.EDU, mbwall@MIT.EDU, mshiffer@MIT.EDU, network@MIT.EDU,
        nschmidt@MIT.EDU, ops@MIT.EDU, phils@MIT.EDU, release-team@MIT.EDU,
        takehiko@MIT.EDU, tfitz@MIT.EDU, thg@MIT.EDU, tom@MIT.EDU
In-Reply-To: Your message of "Tue, 16 Dec 1997 20:49:05 EST."
             <199712170149.UAA10328@the-other-woman.MIT.EDU> 
Date: Wed, 17 Dec 1997 06:49:23 EST
From: Ezra Peisach <epeisach@MIT.EDU>


I have seen the failure as well between a Solaris client and a DEC Alpha
server.

I checked the man pages for the DECstations, and they imply the same
retry scheme, but I remember from the good old days of BSD and Ultrix
that the retry was an exponential backoff. That is, a retry count of 5
and a timeo of 8 meant retries at 0.8 seconds, 3.2 seconds, 12.8
seconds, and so on, up to a retry timeout of 30 seconds. So a retry
count of 5 could add up to a lot. (For those interested in the code:
the bsd-4.3 sources have common/sys/nfs/nfs_subr.c, the netbsd sources
have sys/nfs/nfs_socket.c, and linux has fs/nfs/inode.c.) The different
platforms have slightly different multipliers, but the concept is the
same. This was also done to ensure you don't saturate the network with
retries.
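
To put numbers on "could add up to a lot", here is a rough sketch of the
backoff as I remember it, not code from any of the kernels above. It
assumes timeo is in tenths of a second, a multiplier of 4 (read off the
0.8, 3.2, 12.8 figures), a 30 second cap, and that each figure is the
wait for one attempt. The function name backoff_schedule is made up for
illustration.

```python
def backoff_schedule(timeo, retrans, multiplier=4, cap=300):
    """Return the wait for each attempt, in tenths of a second.

    Assumptions (not checked against any kernel source): the wait is
    multiplied by `multiplier` after every attempt and is never allowed
    to exceed `cap` (300 tenths = 30 seconds).
    """
    schedule = []
    wait = timeo
    for _ in range(retrans):
        schedule.append(min(wait, cap))
        wait *= multiplier
    return schedule


schedule = backoff_schedule(timeo=8, retrans=5)
print(schedule)             # [8, 32, 128, 300, 300]
print(sum(schedule) / 10)   # 76.8 (seconds in total)
```

Under those assumptions a retry count of 5 with a timeo of 8 waits about
77 seconds in total, against 4 seconds if every attempt used a fixed 0.8
second timeout.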

I suspect that Sun has changed this concept of timeout and retry count
(which is amusing as Ultrix and our BSD 4.3 NFS were based on Sun's
NFS implementation). I do not have access to the knfs/Sun source tree and
cannot verify this.

If Sun has done away with this scheme, then attach.conf should be
changed to reflect the Solaris change in methodology. One can argue
that Sun should be shot for not having a backoff scheme (if they
don't) - but Sun will be Sun.

	Ezra

