[776] in arla-drinkers

Re: severe cache coherency problem

daemon@ATHENA.MIT.EDU (Love)
Thu Apr 22 14:20:50 1999

From owner-arla-drinkers@stacken.kth.se Thu Apr 22 18:20:50 1999
Return-Path: <owner-arla-drinkers@stacken.kth.se>
Delivered-To: arla-drinkers-mtg@bloom-picayune.mit.edu
Received: (qmail 21240 invoked from network); 22 Apr 1999 18:20:48 -0000
Received: from unknown (HELO sundance.stacken.kth.se) (130.237.234.41)
  by bloom-picayune.mit.edu with SMTP; 22 Apr 1999 18:20:48 -0000
Received: (from majordom@localhost)
	by sundance.stacken.kth.se (8.8.8/8.8.8) id UAA24784
	for arla-drinkers-list; Thu, 22 Apr 1999 20:14:01 +0200 (MET DST)
Received: from elixir.e.kth.se (elixir.e.kth.se [130.237.48.5])
	by sundance.stacken.kth.se (8.8.8/8.8.8) with ESMTP id UAA24776
	for <arla-drinkers@stacken.kth.se>; Thu, 22 Apr 1999 20:13:56 +0200 (MET DST)
Received: from robert.e.kth.se (robert.e.kth.se [130.237.48.106])
	by elixir.e.kth.se (8.9.2/8.9.2) with ESMTP id UAA27206;
	Thu, 22 Apr 1999 20:13:54 +0200 (MET DST)
Received: (from lha@localhost)
	by robert.e.kth.se (8.9.2/8.9.2) id UAA12196;
	Thu, 22 Apr 1999 20:13:54 +0200 (MET DST)
From: Love <lha@stacken.kth.se>
To: "Mattias Engdegård" <f91-men@nada.kth.se>
Cc: arla-drinkers@stacken.kth.se
Subject: Re: severe cache coherency problem
References: <199904212305.BAA15016@orion.nada.kth.se>
Mime-Version: 1.0 (generated by tm-edit 7.106)
Content-Type: text/plain; charset=US-ASCII
Date: 22 Apr 1999 20:13:53 +0200
In-Reply-To: "Mattias Engdegård"'s message of Thu, 22 Apr 1999 01:05:52 +0200 (MET DST)
Message-ID: <amzp40v7z2.fsf@robert.e.kth.se>
Lines: 42
X-Mailer: Gnus v5.5/Emacs 20.2
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

"Mattias Engdegård" <f91-men@nada.kth.se> writes:

> I'm creating 2 files, a and b, on host X (Solaris, transarc AFS client).
> They are visible and readable from host Y (Linux 2.2.6, arla 0.23, libc5.4.46).
> When X removes file a, arlad on Y says:
> 
>   Thu Apr 22 00:45:31 1999: arlad: callback (130.237.42.231)
>   Thu Apr 22 00:45:31 1999: arlad: -1: (536880467, 23, 517742)
>   Thu Apr 22 00:45:31 1999: arlad: callback for non-existing file (-1, 536880467, 23, 517742)
> 
> but ls on Y can see both. When X removes file b, arlad reports nothing
> (running with debug=almost-all).

As the -1 hints, arlad seems to be missing the address in the connection
cache, and thus fails to find its cell. Could you start arlad with -n
and, when this happens, run `fs venuslog'? This should output a dump of
the state of all parts of arla. Please verify that arla doesn't have
130.237.42.231 in the connection cache.

There should probably be some refcount on the connection from the
FCacheEntry too.
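
Roughly what I mean by the refcount, as a minimal sketch only. None of
the struct or function names below are arla's real ones (conn_entry,
fcache_entry, conn_ref and friends are all made up for illustration);
the point is just that a file entry holding a callback pins the
connection, so that the connection cache can drop its own reference
without losing the host-to-cell mapping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connection-cache entry; not arla's struct. */
struct conn_entry {
    uint32_t host;      /* server address, host byte order */
    int32_t  cell;      /* cell the server belongs to */
    unsigned refcount;  /* holders: the cache itself plus file entries */
};

/* Hypothetical stand-in for the FCacheEntry mentioned above. */
struct fcache_entry {
    struct conn_entry *conn;  /* pinned while the entry needs the server */
};

/* Create an entry holding one reference, owned by the connection cache. */
struct conn_entry *conn_new(uint32_t host, int32_t cell)
{
    struct conn_entry *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->host = host;
    c->cell = cell;
    c->refcount = 1;
    return c;
}

void conn_ref(struct conn_entry *c)
{
    c->refcount++;
}

/* Drop one reference; returns 1 if the entry was freed, 0 otherwise. */
int conn_unref(struct conn_entry *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}

/* The file entry takes its own reference on the connection. */
void fcache_attach(struct fcache_entry *f, struct conn_entry *c)
{
    conn_ref(c);
    f->conn = c;
}

/* Release the file entry's reference, if it holds one. */
void fcache_detach(struct fcache_entry *f)
{
    if (f->conn != NULL) {
        conn_unref(f->conn);
        f->conn = NULL;
    }
}
```

With something like this, evicting the connection from the cache is
just a conn_unref(): the entry stays alive as long as a file entry is
attached, so an incoming callback could still be mapped to a cell.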

> Trying to open the files produces
> 
>   Thu Apr 22 00:46:46 1999: arlad: worker 0: processing
>   Thu Apr 22 00:46:46 1999: arlad: Rec message: opcode = 12 (open), size = 40
>   Thu Apr 22 00:46:46 1999: arlad: read_data
>   Thu Apr 22 00:46:46 1999: arlad: Error reading length: Network dropped connection because of reset
>   Thu Apr 22 00:46:46 1999: arlad: multi-sending wakeup: seq = 28, error = 102
>   Thu Apr 22 00:46:46 1999: arlad: worker 0: done
>   Thu Apr 22 00:46:46 1999: arlad: worker 0 waiting
>   Thu Apr 22 00:46:46 1999: arlad: worker 0: processing
>   Thu Apr 22 00:46:46 1999: arlad: Rec message: opcode = 10 (inactivenode), size = 32
>   Thu Apr 22 00:46:46 1999: arlad: worker 0: done
>   Thu Apr 22 00:46:46 1999: arlad: worker 0 waiting
>   cat: b: Network dropped connection because of reset

This error should be handled better, but it will never occur if the
callback code works.

Love
