[35994] in Kerberos
Re: The mysterious death of kprop when running incremental propagation
daemon@ATHENA.MIT.EDU (Jeremy Hunt)
Wed Apr 2 22:31:55 2014
From: Jeremy Hunt <jeremyh@optimation.com.au>
To: William Clark <majorgearhead@gmail.com>
Date: Thu, 03 Apr 2014 10:12:32 +1100
Message-ID: <533c9960.5ad3.4203d940.664d0ca3@optimation.com.au>
MIME-Version: 1.0
Cc: kerberos@mit.edu
Content-Type: text/plain; charset="utf-8"
Errors-To: kerberos-bounces@mit.edu
Content-Transfer-Encoding: 8bit
Hi William,
Apologies for not responding sooner.
You have 9 kprop processes pinging your server. They talk to the
kadmind service, which propagates individual entries back to them and
occasionally does a full propagation. During a full propagation it
does a dump with kdb5_util itself. I am uncertain how it handles 9
requests for a full propagation. On top of this, you say you run a
kdb5_util dump every 10 minutes.
This probably does strain the locking mechanism, especially as it is
all asynchronous: you have no control over when these runs coincide.
If it is a locking problem, it may be due as much to the kprop
processes tripping over each other as to the kdb5_util dump process;
that may even be the more likely cause.
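One way to at least rule out overlapping cron-driven dumps is to serialize them behind a lock. A minimal sketch, assuming Linux flock(1); the wrapper name, lock path, and dump path are my own illustrations, not from this thread, and this only serializes the cron jobs, not kadmind's own full propagations:

```shell
#!/bin/sh
# Hypothetical cron wrapper: serialize the periodic kdb5_util dump behind
# flock(1) so overlapping cron runs skip instead of contending for the
# database lock. Paths and the helper name are assumptions for illustration.
DUMP_CMD=${DUMP_CMD:-"/usr/sbin/kdb5_util dump /var/kerberos/krb5kdc/slave_datatrans"}
LOCKFILE=${LOCKFILE:-/var/run/kdc-dump.lock}

locked_dump() {
    exec 9> "$LOCKFILE" || return 1   # open (and create) the lock file on fd 9
    if ! flock -n 9; then             # non-blocking: give up if already held
        echo "previous dump still running; skipping this run" >&2
        return 0
    fi
    $DUMP_CMD                         # lock is released when fd 9 closes at exit
}
```

Pointing cron at a wrapper like this means a slow dump simply causes the next run to skip, rather than two dumps fighting over the database.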
Greg is correct that the propagation code has improved considerably
with the later versions.
I have two suggestions.
1. You do not have to use the CentOS Kerberos package. You can
download and build the latest MIT Kerberos release and use that; just
disable or uninstall the CentOS packages. You do then have to watch
for security alerts yourself and patch and rebuild the code yourself.
This is not too bad, as Kerberos is pretty robust and such alerts are
infrequent. You would also need to test that this solves your
problem: do you have a test facility with 9 kprop processes from
different machines?
2. Go back to the old cron job doing full propagation in a controlled
manner, so the propagations don't trip over each other. If it takes
20 seconds to dump the database, it probably doesn't take much longer
to propagate the full database; time the propagation and code things
accordingly. So: dump, do the 9 propagations, and every 10 minutes
save your dump. Be careful, though: if it takes about 20 seconds to
propagate the database each time, then with 9 propagations your
updates fall back to roughly 3-minute turnarounds.
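The dump-then-serial-propagation cycle above could be scripted roughly as follows. The hostnames, paths, and the propagate_all helper are illustrative assumptions, not from the thread; the per-slave and whole-cycle timing lets you measure the turnaround described above:

```shell
#!/bin/sh
# Sketch of the cron-driven cycle: dump once, then push the dump to each
# slave serially with kprop -f, timing every push. Hostnames, paths, and
# the propagate_all helper are assumptions for illustration.
DUMPFILE=${DUMPFILE:-/var/kerberos/krb5kdc/slave_datatrans}
SLAVES=${SLAVES:-"kdc1.example.com kdc2.example.com kdc3.example.com"}
KPROP=${KPROP:-/usr/sbin/kprop}
KDB5_UTIL=${KDB5_UTIL:-/usr/sbin/kdb5_util}

propagate_all() {
    cycle_start=$(date +%s)
    "$KDB5_UTIL" dump "$DUMPFILE" || return 1
    for slave in $SLAVES; do
        start=$(date +%s)
        if "$KPROP" -f "$DUMPFILE" "$slave"; then
            echo "$slave: ok in $(( $(date +%s) - start ))s"
        else
            echo "$slave: FAILED" >&2
        fi
    done
    echo "full cycle: $(( $(date +%s) - cycle_start ))s"
}
```

Run from cron (e.g. */10 * * * *) and compare the reported full-cycle time against the 10-minute window: if 9 serial pushes approach 3 minutes, that is the real turnaround your slaves will see.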
Good Luck,
Jeremy
>
> --- Original message ---
> Subject: Re: The mysterious death of kprop when running incremental
> propagation
> From: William Clark <majorgearhead@gmail.com>
> To: Greg Hudson <ghudson@mit.edu>
> Cc: <kerberos@mit.edu>
> Date: Thursday, 03/04/2014 9:07 AM
>
> I am between a rock and a hard place. I must use CentOS upstream
> packages, however their upstream latest is 1.10.3. I see one of the
> bugs fixed was an issue where a full propagation doesn't complete all
> the way but kprop thinks it's fine. I think this may be what I am
> hitting. Wondering if there is any tuning I could do to mitigate this
> while I wait for later packages. My only other option is to go back
> to traditional propagation.
>
> Right now my slaves have this config:
> iprop_master_ulogsize = 1000
> iprop_slave_poll = 2m
>
> Additionally, as I shared before, I am running the following every 10
> minutes: '/usr/sbin/kdb5_util dump'
>
> I wonder if upping the ulog size would allow more time before a full
> prop is called for those times my server is ultra busy. My thinking
> is this may be happening during a full prop, which happens because
> the server was busy for a period of time.
>
> Any thoughts would be helpful.
>
>
> William Clark
>
>
>
> On Mar 31, 2014, at 8:34 PM, Greg Hudson <ghudson@MIT.EDU> wrote:
>
>>
>> On 03/31/2014 05:44 PM, William Clark wrote:
>>>
>>> Running the following from CentOS upstream:
>>> krb5-server-1.10.3-10.el6_4.6.x86_64
>>>
>>> I am not averse to going with the latest stable MIT version if it
>>> will help with this.
>>
>> I think testing 1.12.1 would be worthwhile. I don't know of any
>> specific bugs in 1.10 which could lead to a SIGABRT, but there are
>> numerous iprop and locking improvements which went into 1.11 and 1.12
>> but were too invasive to backport to 1.10.
________________________________________________
Kerberos mailing list Kerberos@mit.edu
https://mailman.mit.edu/mailman/listinfo/kerberos