[16440] in Athena Bugs


Sun Ultra-10 crash

daemon@ATHENA.MIT.EDU (Thomas Bushnell, BSG)
Fri Oct 23 18:37:09 1998

Date: Fri, 23 Oct 1998 18:37:08 -0400
From: tb@MIT.EDU (Thomas Bushnell, BSG)
To: bugs@MIT.EDU
Cc: jhawk@MIT.EDU


A Sun Ultra-10 user was copying files from the clowd locker into
/var/tmp.  Several odd things were observed.  The files are all
smaller than 163938 bytes.

He used a command like:

mkdir /var/tmp/yegg/; cp /mit/clowd/yegg/*.* /var/tmp/yegg/

Running `ls' periodically in another window while the first file is
being copied shows the copy growing to multiple megabytes; running ls
in the clowd locker then lists the source file (incorrectly) as about
47 Mbytes.  This does not always happen, but it does happen often.

The user reports (though I did not see it) the same behavior on an
SGI.  Because the problem is intermittent, with persistence he could
successfully copy the file at the right size.
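
Since the wrong sizes are intermittent, a per-file check after each
copy may help narrow down when they occur.  The following is only a
minimal sketch, assuming a POSIX shell; the names copy_and_check,
file_size and MAX_BYTES are hypothetical and are not part of the
original report or of any Athena tool.  It copies one file at a time
and flags any copy whose size differs from the source or reaches the
163938-byte bound mentioned above.

```shell
#!/bin/sh
# Upper bound from the report: every source file is smaller than this.
MAX_BYTES=163938

# Print the size of a file in bytes (wc -c is widely portable).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# copy_and_check SRC_DIR DST_DIR
# Copies every regular file matching *.* from SRC_DIR to DST_DIR and
# prints one line per suspicious file.
# Returns 0 if every copy looks sane, 1 if any looked wrong,
# 2 if DST_DIR could not be created.
copy_and_check() {
    src_dir=$1
    dst_dir=$2
    status=0
    mkdir -p "$dst_dir" || return 2
    for src in "$src_dir"/*.*; do
        [ -f "$src" ] || continue
        dst="$dst_dir/$(basename "$src")"
        if ! cp "$src" "$dst"; then
            echo "copy failed: $src"
            status=1
            continue
        fi
        src_size=$(file_size "$src")
        dst_size=$(file_size "$dst")
        # Flag a size disagreement, or a copy at or above the known bound.
        if [ "$src_size" -ne "$dst_size" ] || [ "$dst_size" -ge "$MAX_BYTES" ]; then
            echo "suspicious size: $src (source $src_size, copy $dst_size)"
            status=1
        fi
    done
    return $status
}
```

With the paths from the command above, this could be run as
`copy_and_check /mit/clowd/yegg /var/tmp/yegg'.  Note that the source
size is read through the same NFS mount, so a source file misreported
by the server is only caught by the MAX_BYTES bound.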

Periodically the cp process reports NFS errors indicating malformed
replies from the NFS server of the clowd locker.  I do not know if the
same problems happened on an SGI.

After copying ~20 files, the Sun crashes.  The crash (according to the
user) happens reliably.

I have the errors it printed on the console written out; I'll provide
them on request.  The core dump may be found in
/mit/bitbucket/tb-core.  

Clearly the NFS server for the clowd locker (callisto.mit.edu) is
defective, but a defective NFS server should not be able to provoke
kernel panics.

Thomas
