[7241] in testers


Re: owl losing subs during AFS restart

daemon@ATHENA.MIT.EDU (Mitchell E Berger)
Sun Jul 10 12:50:13 2005

Message-Id: <200507101650.j6AGnx8q009853@byte-me.mit.edu>
To: Kevin Chen <kchen@MIT.EDU>
cc: testers@MIT.EDU, bug-owl@ktools.org
In-Reply-To: Your message of "Sun, 10 Jul 2005 11:40:15 EDT."
             <42D1415F.7040905@mit.edu> 
Date: Sun, 10 Jul 2005 12:49:59 -0400
From: Mitchell E Berger <mitchb@MIT.EDU>

I've seen similar behavior in a very slightly different situation - if
you have an obscene number of zephyrs in your owl buffer (say on the order
of a quarter million) and suspend to get new tickets and tokens, when you
bring owl back to the foreground, it will take a very long time (on the order
of 15-20 minutes) for owl to finish paging in and become quiescent again.
On occasion, though not always, I've noticed that my subs are gone after this
happens.  The amount of time this takes is very close to the amount of time
owl would be unresponsive while trying to write into AFS during the weekly
restart, so I suspect you're seeing exactly the same problem.

My bloated owl session is on a Solaris Athena 9.3 machine, so I'm fairly
certain the problem has nothing to do with either Linux or 9.4, and I'd
been working under the assumption that this wasn't owl's fault either because
I believe it's zhm's job to maintain subscriptions for you, and it doesn't
seem completely silly for zhm to assume your client has died and drop its
subs if the client has been completely unresponsive for nearly 20 minutes.
I am curious if it actually has logic like that, though, and haven't looked
through the source yet.

Mitch

> Kevin Chen wrote:
> > I did not experience this while running Linux-Athena 9.3, and the other
> > owl session I have on a Solaris-Athena 9.3 machine did not experience
> > this problem.
> 
> It seems likely that this might be caused by the "classlogging" variable
> that I have set, which logs zephyrs to my home directory.  On
> scyther.mit.edu, this variable was on, while on the Solaris-Athena 9.3
> machine, this variable was not set.  However, even though I've had this
> option set in the past under Linux-Athena 9.3, I didn't have this
> problem before.
> 
> -- 
> Kevin Chen
> http://www.sneswhiz.com/
