[573] in arla-drinkers

More info on lockup problem with arla 0.21

daemon@ATHENA.MIT.EDU (Neulinger, Nathan R.)
Thu Feb 4 15:07:15 1999

From owner-arla-drinkers@stacken.kth.se Thu Feb 04 20:07:15 1999
Return-Path: <owner-arla-drinkers@stacken.kth.se>
Delivered-To: arla-drinkers-mtg@bloom-picayune.mit.edu
Received: (qmail 24944 invoked from network); 4 Feb 1999 20:07:14 -0000
Received: from unknown (HELO sundance.stacken.kth.se) (130.237.234.41)
  by bloom-picayune.mit.edu with SMTP; 4 Feb 1999 20:07:14 -0000
Received: (from majordom@localhost)
	by sundance.stacken.kth.se (8.8.8/8.8.8) id VAA02918
	for arla-drinkers-list; Thu, 4 Feb 1999 21:02:24 +0100 (MET)
Received: from umr.edu (hermes.cc.umr.edu [131.151.1.68])
	by sundance.stacken.kth.se (8.8.8/8.8.8) with ESMTP id VAA02914
	for <arla-drinkers@stacken.kth.se>; Thu, 4 Feb 1999 21:02:17 +0100 (MET)
Received: from umr-mail01.cc.umr.edu (umr-mail01.cc.umr.edu [131.151.37.121]) via ESMTP by hermes.cc.umr.edu (8.8.7/R.4.20) id OAA00491; Thu, 4 Feb 1999 14:02:14 -0600 (CST)
Received: by umr-mail01.cc.umr.edu with Internet Mail Service (5.5.2232.9)
	id <D9V037TD>; Thu, 4 Feb 1999 14:02:14 -0600
Message-ID: <9DA8D24B915BD1118911006094516EAF019C7ED3@umr-mail02.cc.umr.edu>
From: "Neulinger, Nathan R." <nneul@umr.edu>
To: "'arla-drinkers@stacken.kth.se'" <arla-drinkers@stacken.kth.se>
Subject: More info on lockup problem with arla 0.21
Date: Thu, 4 Feb 1999 14:02:13 -0600 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
	charset="ISO-8859-1"
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

It locked up on me again. This time, however, I was able to turn on
debugging. The machine itself looks OK, but all AFS access is hanging. With
arlad debugging set to almost-all, I get a continuous stream (flooding
syslog) of:

Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765152, error = 0
Feb  4 13:56:33 sysmon arla[357]: worker 0: done
Feb  4 13:56:33 sysmon arla[357]: worker 0 waiting
Feb  4 13:56:33 sysmon arla[357]: worker 0: processing
Feb  4 13:56:33 sysmon arla[357]: Rec message: opcode = 4 (getnode), size = 292
Feb  4 13:56:33 sysmon arla[357]: Multi-send: opcode = 5 (installnode), size = 384
Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765153, error = 0
Feb  4 13:56:33 sysmon arla[357]: worker 0: done
Feb  4 13:56:33 sysmon arla[357]: worker 0 waiting
Feb  4 13:56:33 sysmon arla[357]: worker 0: processing
Feb  4 13:56:33 sysmon arla[357]: Rec message: opcode = 4 (getnode), size = 292
Feb  4 13:56:33 sysmon arla[357]: Multi-send: opcode = 5 (installnode), size = 384
Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765154, error = 0
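
For anyone wanting to check a saved syslog for the same pattern, here is a
rough sketch of how the getnode/installnode loop could be spotted
mechanically. It is only an illustration: summarize() and looks_like_loop()
are hypothetical helper names, not part of arla, and the regular expressions
assume the exact message wording in the excerpt above, with each syslog entry
on a single line as it is in the syslog file itself.

```python
import re
from collections import Counter

# Patterns are assumptions based only on the message wording quoted above.
OPCODE_RE = re.compile(
    r"arla\[\d+\]: (Rec message|Multi-send): opcode = (\d+) \((\w+)\)"
)
SEQ_RE = re.compile(r"multi-sending wakeup: seq = (\d+), error = (\d+)")


def summarize(lines):
    """Count (direction, opcode name) pairs and collect wakeup sequence numbers."""
    opcodes = Counter()
    seqs = []
    for line in lines:
        m = OPCODE_RE.search(line)
        if m:
            opcodes[(m.group(1), m.group(3))] += 1
            continue
        m = SEQ_RE.search(line)
        if m:
            seqs.append(int(m.group(1)))
    return opcodes, seqs


def looks_like_loop(opcodes, seqs, min_rounds=2):
    """Heuristic: every getnode is answered by an installnode, at least
    min_rounds times, and the wakeup sequence numbers are consecutive."""
    gets = opcodes[("Rec message", "getnode")]
    installs = opcodes[("Multi-send", "installnode")]
    consecutive = all(b - a == 1 for a, b in zip(seqs, seqs[1:]))
    return gets >= min_rounds and gets == installs and consecutive


# Entries copied from the excerpt above (worker lines omitted for brevity).
SAMPLE = """\
Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765152, error = 0
Feb  4 13:56:33 sysmon arla[357]: Rec message: opcode = 4 (getnode), size = 292
Feb  4 13:56:33 sysmon arla[357]: Multi-send: opcode = 5 (installnode), size = 384
Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765153, error = 0
Feb  4 13:56:33 sysmon arla[357]: Rec message: opcode = 4 (getnode), size = 292
Feb  4 13:56:33 sysmon arla[357]: Multi-send: opcode = 5 (installnode), size = 384
Feb  4 13:56:33 sysmon arla[357]: multi-sending wakeup: seq = 765154, error = 0
"""

if __name__ == "__main__":
    ops, seqs = summarize(SAMPLE.splitlines())
    print(dict(ops), seqs, looks_like_loop(ops, seqs))
```

The check is deliberately crude. It says nothing about why the same getnode
keeps coming back, only that the request/reply pair repeats with no other
traffic in between.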

There doesn't appear to be any XFS activity going on. The only thing that
got written to the log was:

Feb  4 13:58:18 sysmon kernel: xfs_syscall returns error: 0
Feb  4 13:58:18 sysmon kernel: xfs_pioctl
Feb  4 13:58:18 sysmon kernel: xfs_fh_to_dentry: dev: 0 inode: 25
Feb  4 13:58:18 sysmon kernel: xfs_syscall returns error: 7
Feb  4 13:58:18 sysmon kernel: xfs_message_receive opcode = 5
Feb  4 13:58:18 sysmon kernel: xfs_message_installnode
Feb  4 13:58:18 sysmon kernel: xfs_node_find
Feb  4 13:58:18 sysmon kernel: xfs_message_installnode: dp: ceabc7c4
Feb  4 13:58:18 sysmon kernel: xfs_message_installnode: fetching new node
Feb  4 13:58:18 sysmon kernel: new_xfs_node 0.536910899.799.2681
Feb  4 13:58:18 sysmon kernel: xfs_node_find

Killing arlad leads to an extremely unhappy kernel.

-------

As a side note, other than the cache parameters (which I have at 150M and
140M), is there much reason to go with anything other than the defaults in
arla.conf?

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216 
