[598] in arla-drinkers


full debug log from arla lockup/non-responsiveness

daemon@ATHENA.MIT.EDU (Neulinger, Nathan R.)
Thu Feb 11 12:31:44 1999

Message-ID: <9DA8D24B915BD1118911006094516EAF019C7EFB@umr-mail02.cc.umr.edu>
From: "Neulinger, Nathan R." <nneul@umr.edu>
To: "'arla-drinkers@stacken.kth.se'" <arla-drinkers@stacken.kth.se>
Subject: full debug log from arla lockup/non-responsiveness
Date: Thu, 11 Feb 1999 11:24:33 -0600
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

I set both the xfs and arla debugging to all. 

The lockup seems to affect particular directories. Also, once a directory
is in this state (where access to it locks up), running flushv on it
completely freezes the system: it still answers pings but nothing
else.

The full debug log is here:

http://www.umr.edu/~nneul/debug-traces/arla-webindex-19990211-ls

The first is the output from running 'ls directory', waiting a while, and
then pressing control-C.

The flushv only got the following before locking up the machine (no oops):

	Feb 11 10:59:15 webindex arla[361]: probe (192.65.97.1)
	Feb 11 10:59:20 webindex arla[361]: worker 0: processing
	Feb 11 10:59:20 webindex arla[361]: Rec message: opcode = 22 (pioctl), size = 2096
	Feb 11 10:59:20 webindex arla[361]: sending wakeup: seq = 1273148, error = 22
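
The last line of that trace reports error = 22. Assuming that field is an
ordinary Unix errno value (an assumption; the log itself does not say so),
it can be decoded with a short Python sketch. The helper name decode_wakeup
and the regular expression are illustrative only and are not part of arla:

```python
import errno
import re

# Hypothetical pattern for the "sending wakeup" line shown in the trace.
WAKEUP_RE = re.compile(r"sending wakeup: seq = (\d+),\s*error = (\d+)")

def decode_wakeup(line):
    """Return (seq, error number, symbolic errno name), or None if no match."""
    m = WAKEUP_RE.search(line)
    if m is None:
        return None
    seq, err = int(m.group(1)), int(m.group(2))
    # Assumption: the error field is a plain errno value.
    name = errno.errorcode.get(err, "unknown")
    return seq, err, name

line = ("Feb 11 10:59:20 webindex arla[361]: "
        "sending wakeup: seq = 1273148, error = 22")
print(decode_wakeup(line))
```

Under that assumption, 22 maps to EINVAL (invalid argument), which would
mean the pioctl was rejected rather than left pending.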

The arla config is:

	high_vnodes 4000
	low_vnodes 3000
	numcreds 100
	numconns 100
	numvols 100
	fpriority 100
	high_bytes 150M
	low_bytes 140M

The cache status shortly after the time of the hangup:

	Arla is using 2929 of the cache's available 153600 1K byte blocks
	(and 805 of the cache's available 4000 vnodes)
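
As a cross-check of these numbers against the config, here is a minimal
Python sketch. It assumes the "M" suffix in high_bytes/low_bytes means
binary megabytes (1024 1K blocks each), which is consistent with the
153600 figure above; parse_size is a hypothetical helper, not an arla
function:

```python
def parse_size(text):
    """Convert a size such as '150M' to 1K blocks, assuming binary units."""
    units = {"K": 1, "M": 1024, "G": 1024 * 1024}
    return int(text[:-1]) * units[text[-1].upper()]

high_blocks = parse_size("150M")   # high_bytes from the config
low_blocks = parse_size("140M")    # low_bytes from the config

# Figures taken from the cache status report.
used_blocks, used_vnodes, high_vnodes = 2929, 805, 4000

print(high_blocks)                                  # 153600
print(round(100 * used_blocks / high_blocks, 1))    # 1.9
print(round(100 * used_vnodes / high_vnodes, 1))    # 20.1
```

So the cache was at roughly 2% of its block limit and 20% of its vnode
limit, well below both the high and low water marks.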

Is there anything else you want me to get?

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216 
