[577] in arla-drinkers


Re: arla 0.21 Oops on Linux 2.2.1

daemon@ATHENA.MIT.EDU (Chuck Lever)
Thu Feb 4 23:09:47 1999

From owner-arla-drinkers@stacken.kth.se Fri Feb 05 04:09:46 1999
Return-Path: <owner-arla-drinkers@stacken.kth.se>
Delivered-To: arla-drinkers-mtg@bloom-picayune.mit.edu
Received: (qmail 29818 invoked from network); 5 Feb 1999 04:09:46 -0000
Received: from unknown (HELO sundance.stacken.kth.se) (130.237.234.41)
  by bloom-picayune.mit.edu with SMTP; 5 Feb 1999 04:09:46 -0000
Received: (from majordom@localhost)
	by sundance.stacken.kth.se (8.8.8/8.8.8) id FAA06295
	for arla-drinkers-list; Fri, 5 Feb 1999 05:05:08 +0100 (MET)
Received: from elixir.e.kth.se (1073744992@elixir.e.kth.se [130.237.48.5])
	by sundance.stacken.kth.se (8.8.8/8.8.8) with ESMTP id FAA06290
	for <arla-drinkers@stacken.kth.se>; Fri, 5 Feb 1999 05:05:00 +0100 (MET)
Received: from zinfandel.e.kth.se (zinfandel.e.kth.se [130.237.48.172])
	by elixir.e.kth.se (8.9.2/8.9.2) with ESMTP id FAA02067
	for <arla-drinkers@stacken.kth.se>; Fri, 5 Feb 1999 05:04:59 +0100 (MET)
Received: (from lha@localhost)
	by zinfandel.e.kth.se (8.9.2/8.9.2) id FAA21971
	for arla-drinkers@stacken.kth.se; Fri, 5 Feb 1999 05:03:42 +0100 (MET)
Date: Fri, 5 Feb 1999 05:03:42 +0100 (MET)
From: Chuck Lever <cel@monkey.org>
To: Magnus Ahltorp <map@stacken.kth.se>
Subject: Re: arla 0.21 Oops on Linux 2.2.1
In-Reply-To: <ixdiudjyw50.fsf@scup.pdc.kth.se>
Message-ID: <Pine.BSF.3.96.990204172754.11785A-100000@naughty.monkey.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Lines: 46
Xref: mumrik.nada.kth.se mail.private:1696
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

On 3 Feb 1999, Magnus Ahltorp wrote:
> > so other people on this list are running arla 0.21 on Linux 2.2.1
> > successfully?  it won't run at all for me; it just oops's when i try to
> > "cd" into our afs cell.
>
> I just tested arla 0.21 on Linux 2.2.1. I can't reproduce the oops.

ok, this may be a cascade of smaller failures.  i'll break this down into
smaller bits to see if we can get an idea what's going on...

first, with arladebug set to "all", and running "arlad --no-fork" so i can
catch error messages, i try mounting /afs, and this is what arlad says:

Thu Feb  4 17:24:44 1999: arlad: worker 0: processing
Thu Feb  4 17:24:44 1999: arlad: Rec message: opcode = 2 (getroot), size = 20
Thu Feb  4 17:24:44 1999: arlad: VL_GetEntryByNameN(root.afs): Unknown error 4294966841
Thu Feb  4 17:24:44 1999: arlad: Failed to contact any db servers in cell 0(citi.umich.edu)
Thu Feb  4 17:24:44 1999: arlad: Cannot find the root volume
Thu Feb  4 17:24:44 1999: arlad: multi-sending wakeup: seq = 8, error = 110
Thu Feb  4 17:24:44 1999: arlad: worker 0: done
Thu Feb  4 17:24:44 1999: arlad: worker 0 waiting
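
the "error = 110" in the wakeup line looks like a plain errno value; on
linux, errno 110 is ETIMEDOUT.  a minimal sketch for looking such values up,
assuming it is run on the same OS as arlad (errno numbering differs between
platforms).  describe_errno is a hypothetical helper name, not part of arla:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as "ETIMEDOUT";
    # values unknown on this platform fall back to "unknown".
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


# 110 is the value from the wakeup line in the log above.
print(describe_errno(110))
```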

the mount command is successful -- no error is reported, and "df" reports
that /afs is one of the mounted file systems.

i believe that citi.umich.edu doesn't maintain a replica of root.afs.
with earlier releases, i would also get a warning, but things worked after
that.  is there a way to decode 4294966841, or is this a sign of error
code value mismatches between our AFS servers and arla?
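
one way to start decoding it: 4294966841 is close to 2^32, which suggests a
negative 32-bit error code printed as unsigned.  the sketch below
reinterprets it as a signed value.  the name table is an assumption taken
from the rxgen constants as i remember them from rx headers (values in the
-450s); check it against the headers in your own tree before relying on it.
as_signed32 and RXGEN_NAMES are names made up for this example:

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# ASSUMPTION: mapping recalled from rx/rxgen headers, not from this thread.
RXGEN_NAMES = {
    -455: "RXGEN_OPCODE",  # server did not recognize the requested call
}

code = as_signed32(4294966841)
print(code)                                   # -455
print(RXGEN_NAMES.get(code, "not in table"))
```

if that mapping holds, the vlserver answered but rejected the call itself,
which would fit with tcpdump showing a reply from the server.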

the "Failed to contact" message started appearing after upgrading to 0.21.

according to tcpdump, a udp packet goes to the afs3-vlserver port on
babble.citi, and babble responds with a packet.  so arla is pinging the
correct vlserver host, and the host is responding.

        - Chuck Lever
--
corporate:      <chuckl@netscape.com>
personal:       <chucklever@netscape.net> or <cel@monkey.org>
