[41009] in Hotline Meeting

Network and/or athena NFS problem

daemon@ATHENA.MIT.EDU (Fawyn Herd)
Thu Dec 4 10:27:59 1997

To: hotline@MIT.EDU
Cc: accounts@MIT.EDU
Reply-To: accounts@MIT.EDU
In-Reply-To: [80233]
Date: Thu, 04 Dec 1997 10:27:58 EST
From: Fawyn Herd <fawyn@MIT.EDU>


Hi,

This was forwarded to the Accounts Office from the HelpDesk, but it looks
like it's for your office.
======================


 Software: Unknown
 Software Version: 
 Software Category: 
 IP Address: 
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 Created by: client
 Created on: 11/26/97 18:35
 Modified by: client
 Modified on: 11/26/97 18:35
 Closed by: 
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 History Text: Date: Wed, 26 Nov 1997 18:35:27 EST
To: computer-help@MIT.EDU
From: Tom Fitzgerald <tfitz@MIT.EDU>
Subject: Network and/or athena NFS problem
 
We're having a serious problem with our athena workstations.
 
Many users around here are seeing a problem accessing NFS lockers on
athena workstations, from other athena workstations in other buildings.
When they do even the simplest accesses, even ls or pwd (when cd'd
into the locker), they get an immediate NFS timeout and no data.
 
We've seen this on servers cat2 and archfile (both in 3-409), from
clients kahn, napoli and dentil in 5-414, and from cat1 in building 10.
All these systems are Sparcs, so the NFS connection is going over TCP.
 
The NFS lockers are mounted with what seem to be the athena defaults:
 
/mit/archfile00 on ARCHFILE.MIT.EDU:/export/u0 read/write/nosuid/soft/rsize=1024/wsize=1024/timeo=8/retrans=7/remote on Mon Nov 24 16:47:34 1997
 
As often as not, re-running the command works fine, but the users are
very upset that applications crash with file format errors when timeout
errors happen during reading, and sometimes it takes several tries to
save a modified file back to disk.
 
I'd like to emphasize that the timeouts are instant, which is a mystery
given the timeo=8/retrans=7 mount options; this should give over 5 seconds
before a timeout, but the error message really does follow the command
immediately (I've seen this).
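
To make the "over 5 seconds" estimate concrete, here is a minimal sketch of the arithmetic. It assumes timeo is in tenths of a second and that the client waits once per retrans attempt before a soft mount gives up. The helper names and the backoff parameter are hypothetical, for illustration only; a real client may back off differently or cap the wait.

```python
# Mount options as reported for /mit/archfile00 (line wrap removed).
OPTIONS = "read/write/nosuid/soft/rsize=1024/wsize=1024/timeo=8/retrans=7/remote"

def parse_options(text):
    """Split a '/'-separated option string into a dict; bare flags map to True.

    Note: "read/write" splits into two flags, which is harmless here.
    """
    opts = {}
    for item in text.split("/"):
        key, sep, value = item.partition("=")
        opts[key] = int(value) if sep else True
    return opts

def timeout_window(timeo_tenths, retrans, backoff=1):
    """Total seconds waited across `retrans` attempts.

    Assumption: timeo is in tenths of a second, and each attempt waits
    `backoff` times longer than the previous one (backoff=1: constant wait).
    """
    wait = timeo_tenths / 10
    total = 0.0
    for _ in range(retrans):
        total += wait
        wait *= backoff
    return total

opts = parse_options(OPTIONS)
constant = timeout_window(opts["timeo"], opts["retrans"])
doubling = timeout_window(opts["timeo"], opts["retrans"], backoff=2)
print(f"constant wait: {constant:.1f} s, doubling wait: {doubling:.1f} s")
```

With a constant wait this comes to about 5.6 seconds, and an uncapped doubling backoff would be far longer, so under either assumption an error that appears immediately would not be explained by these options alone.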
 
The problems are much more frequent when the servers are heavily loaded,
but the servers are definitely not reporting any errors themselves, and
keep running fine.
 
Any help would be appreciated - thanks....
 
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
--[80233]--