[1013] in arla-drinkers

Re: Error 'Software caused connection abort' in arla 0.26 and 0.25

daemon@ATHENA.MIT.EDU (Love)
Sat Jul 24 14:02:15 1999

From owner-arla-drinkers@stacken.kth.se Sat Jul 24 18:02:15 1999
Return-Path: <owner-arla-drinkers@stacken.kth.se>
Delivered-To: arla-drinkers-mtg@bloom-picayune.mit.edu
Received: (qmail 8740 invoked from network); 24 Jul 1999 18:02:14 -0000
Received: from unknown (HELO sundance.stacken.kth.se) (130.237.234.41)
  by bloom-picayune.mit.edu with SMTP; 24 Jul 1999 18:02:14 -0000
Received: (from majordom@localhost)
	by sundance.stacken.kth.se (8.8.8/8.8.8) id TAA15501
	for arla-drinkers-list; Sat, 24 Jul 1999 19:57:02 +0200 (MET DST)
Received: from elixir.e.kth.se (elixir.e.kth.se [130.237.48.5])
	by sundance.stacken.kth.se (8.8.8/8.8.8) with ESMTP id TAA15497;
	Sat, 24 Jul 1999 19:56:58 +0200 (MET DST)
Received: from anchor.s3.kth.se (anchor.s3.kth.se [130.237.43.59])
	by elixir.e.kth.se (8.9.3/8.9.3) with ESMTP id TAA15506;
	Sat, 24 Jul 1999 19:56:56 +0200 (MET DST)
Received: (from lha@localhost)
	by anchor.s3.kth.se (8.9.3/8.9.3) id TAA00324;
	Sat, 24 Jul 1999 19:56:54 +0200 (MET DST)
From: Love <lha@stacken.kth.se>
To: Jeffrey Hutzelman <jhutz@cmu.edu>
Cc: Assar Westerlund <assar@stacken.kth.se>,
        Dr A V Le Blanc <LeBlanc@mcc.ac.uk>, arla-drinkers@stacken.kth.se
Subject: Re: Error 'Software caused connection abort' in arla 0.26 and 0.25
References: <Pine.SOL.3.95L.990724115513.3210D-100000@afstest-1.fac.cs.cmu.edu>
Mime-Version: 1.0 (generated by tm-edit 7.106)
Content-Type: text/plain; charset=US-ASCII
Date: 24 Jul 1999 19:56:54 +0200
In-Reply-To: Jeffrey Hutzelman's message of Sat, 24 Jul 1999 12:50:41 -0400 (EDT)
Message-ID: <am9086aq3d.fsf@anchor.s3.kth.se>
Lines: 29
X-Mailer: Gnus v5.5/Emacs 20.2
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

Jeffrey Hutzelman <jhutz@cmu.edu> writes:

> Indeed, in arlad/fcache.c:try_next_fs(), the handling of VMOVED and VNOVOL
> changed between 0.22 and 0.25 (versions I happen to have on hand).
> Previously, if a fileserver returned VMOVED or VNOVOL, arla would try the
> next fileserver, if any.  Now, it gives up on the call immediately, but
> then updates its volume cache and tries again.  The new behaviour is
> correct for VMOVED, but IMNSHO try_next_fs() should still return TRUE for
> VNOVOL, since we could be talking about an RO site which doesn't have an
> online copy of the volume, and I believe the current code will retry such
> a site forever.

The thing is that I have seen VNOVOL directly after a volume moved. I think
the correct way to handle it is to choose the next volume, if there is one,
and to try to avoid talking to servers with known bad volumes.

I know that the loop exists; I just haven't got around to fixing it.
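
The policy described above (move on to another site on VNOVOL, remember
which sites were bad, and stop once none are left) can be sketched as
follows. This is only an illustration, not the code in arlad/fcache.c: the
struct, the function name and MAX_SITES are hypothetical, and ARLA_VMOVED is
assumed to be 4111 by analogy with ARLA_VNOVOL (4103) from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ARLA_VNOVOL is quoted in the thread; ARLA_VMOVED is an assumed value. */
enum { ARLA_VNOVOL = 4103, ARLA_VMOVED = 4111 };

#define MAX_SITES 8

/* Hypothetical per-volume site list; not arla's real data structure. */
struct site_list {
    size_t nsites;          /* number of fileservers holding the volume */
    size_t current;         /* index of the site that was just tried */
    bool   bad[MAX_SITES];  /* set once a site has answered VNOVOL */
};

/*
 * Decide what to do after a call to sl->current failed with `error`.
 * Returns true if sl->current now names another site worth trying,
 * false if the caller should stop rotating (for example, to refresh
 * its volume information).
 */
bool sketch_try_next_site(struct site_list *sl, int error)
{
    assert(sl != NULL && sl->nsites <= MAX_SITES && sl->current < sl->nsites);

    if (error != ARLA_VNOVOL)
        return false;   /* VMOVED and other errors: do not rotate here */

    /* Remember the bad site so it is never picked again. */
    sl->bad[sl->current] = true;

    /* Only look forward, which guarantees the search terminates. */
    for (size_t i = sl->current + 1; i < sl->nsites; i++) {
        if (!sl->bad[i]) {
            sl->current = i;
            return true;
        }
    }
    return false;       /* no usable site left: stop instead of looping */
}
```

Because each VNOVOL marks a site as bad and the search only moves forward,
a set of sites that all lack the volume is exhausted after at most nsites
attempts, rather than being retried forever.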
 
> In any case, I don't think that's your problem -- if that code were broken
> _and_ leaked an error code, it would likely leak ARLA_VNOVOL (4103), not
> VNOVOL (103).  I believe the real problem in this case is that the error
> code translation is not happening, and so the special handling for VNOVOL
> is not happening.  I'll forward more details when I'm more sure of what's
> going on.

We miss the conversion in a couple of places; that is fixed (hopefully) in
the current code (mostly rx_Write).
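
To illustrate why a missed conversion matters, here is a minimal sketch.
It assumes the translation is a fixed offset of 4000 applied to the AFS
volume error range 101..111, which matches the one pair quoted above
(VNOVOL 103, ARLA_VNOVOL 4103) but is otherwise an assumption; the function
names are hypothetical and are not arla's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the thread. */
enum { VNOVOL = 103, ARLA_VNOVOL = 4103 };

/* Assumed: the offset and the range it applies to. */
enum { SKETCH_ARLA_OFFSET = 4000, SKETCH_VERR_FIRST = 101, SKETCH_VERR_LAST = 111 };

/* Map a raw fileserver volume error into the ARLA_ range; pass others through. */
int sketch_to_arla_errno(int error)
{
    if (error >= SKETCH_VERR_FIRST && error <= SKETCH_VERR_LAST)
        return error + SKETCH_ARLA_OFFSET;
    return error;
}

/* The special handling only recognizes the converted code. */
bool sketch_is_vnovol(int error)
{
    return error == ARLA_VNOVOL;
}
```

With this shape, a call path that skips sketch_to_arla_errno() hands the
raw 103 to sketch_is_vnovol(), which returns false, so the VNOVOL handling
never runs and the raw code is what leaks to the caller.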

Love