[47489] in SIPB-AFS-requests
Re: sipb cell AFS server tuning
daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Tue Jun 18 10:44:11 2013
From jweiss@MIT.EDU Tue Jun 18 14:44:11 2013
Date: Tue, 18 Jun 2013 10:44:01 -0400 (EDT)
From: Jonathon Weiss <jweiss@MIT.EDU>
To: Benjamin Kaduk <kaduk@mit.edu>
cc: sipb-afsreq@mit.edu
Subject: Re: sipb cell AFS server tuning
In-Reply-To: <alpine.GSO.1.10.1306180959300.26275@multics.mit.edu>
Message-ID: <alpine.DEB.2.02.1306181041100.31827@the-other-woman.mit.edu>
References: <201306181328.r5IDSjpd012090@outgoing.mit.edu> <alpine.GSO.1.10.1306180959300.26275@multics.mit.edu>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Tue, 18 Jun 2013, Benjamin Kaduk wrote:
> On Tue, 18 Jun 2013, Jonathon Weiss wrote:
>
>> I just noticed a few things about the sipb AFS cell.
>>
>> * it looks like the fileserver is using the arguments that ops
>> selected for the athena cell years ago.
>>
>> * ops recently changed those to increase the number of threads
>> ("-p 36" --> "-p 64") and the number of callback slots ("" -->
>> "-cb 256000"), and there are indications that the sipb cell
>> would benefit from at least the second of these (and possibly
>> the first, but that's less clear).
>>
>> * the sipb cell still restarts weekly. The athena cell hasn't
>> done this in a long time.
>>
>> I'd like to see us update both fileserver arguments, and disable the
>> weekly restart. That said, I don't want to interfere with the server
>> upgrades Mitch and Ben are working on. Do folks agree in concept with
>> these changes, and if so, have any thoughts on coordination? (I could
>> certainly stage the arg changes in BosConfig, let one last weekly restart
>> happen, and then disable that.)
>
> Certainly while we have ~all data on ra, there have been calls waiting for a
> thread, so more threads would be expected to help. It's less clear whether
> more callback slots are actually needed -- there are some log messages
> getting entered in conjunction with callback breaks, but I think they are
> more of the "remote host does not exist anymore or is firewalled off" sort
> than "we ran out of slots".
>
> I would like to defer disabling weekly restarts (and really, all the changes)
> until the software upgrade is finished, though.

http://www.openafs.org/pages/newsletter/newsletter-2013-03-volume004-issue05.html
suggests running "xstat_fs_test $server -collID 3 -onceonly" and
seeing if any of the GSS values are non-zero. If so, the server has
run out of callbacks.
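
A rough sketch of automating that check, in case it is useful. It runs
the command exactly as quoted above and reports any GSS counter that is
non-zero. The output format is an assumption here: the parser only
expects each counter to appear on its own line as a name of the form
GSSn next to one integer, in either order. The function names
(nonzero_gss, check_server) are made up for illustration.

```python
import re
import subprocess
import sys

# Assumption: each callback counter is printed on its own line as a
# name such as "GSS1" alongside a single integer value.
GSS_NAME = re.compile(r"\b(GSS\d+)\b")
INTEGER = re.compile(r"\b\d+\b")


def nonzero_gss(text):
    """Return {counter_name: value} for every GSSn counter that is non-zero."""
    hits = {}
    for line in text.splitlines():
        name = GSS_NAME.search(line)
        if not name:
            continue
        # Look for the value in the rest of the line, excluding the name itself.
        rest = line[:name.start()] + " " + line[name.end():]
        value = INTEGER.search(rest)
        if value and int(value.group()) != 0:
            hits[name.group(1)] = int(value.group())
    return hits


def check_server(server):
    """Run the command from the newsletter and return the non-zero GSS counters."""
    result = subprocess.run(
        ["xstat_fs_test", server, "-collID", "3", "-onceonly"],
        capture_output=True, text=True, check=True,
    )
    return nonzero_gss(result.stdout)


if __name__ == "__main__":
    found = check_server(sys.argv[1])
    if found:
        print("non-zero GSS counters (server has run out of callbacks):", found)
    else:
        print("all GSS counters are zero")
```

Run it with the fileserver's hostname as the only argument. An empty
result means none of the GSS values were non-zero at the time of the
check.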

I have no problem waiting until you're done rototilling the software,
if that's what you'd prefer. I did want to get them on the queue though.

Jonathon