[155498] in North American Network Operators' Group


Re: Does anyone use anycast DHCP service?

daemon@ATHENA.MIT.EDU (Leo Bicknell)
Mon Aug 13 10:11:45 2012

Date: Mon, 13 Aug 2012 07:10:40 -0700
From: Leo Bicknell <bicknell@ufp.org>
To: nanog@nanog.org
Mail-Followup-To: nanog@nanog.org
In-Reply-To: <CAB6yvaG5xQiOAyFs7ptS-KPhbKQ3aET28JLoSkkwkLvsmXxmuQ@mail.gmail.com>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org



In a message written on Mon, Aug 13, 2012 at 08:54:09AM -0500, Ryan Malayter wrote:
> 1) No third-party "witness" service for the cluster, making
> split-brain scenarios a very real possibility.

The ISC implementation is designed to continue to work with a "split
brain".  I believe the Microsoft solution is as well, but I know
less about it.  There's no need to detect if the redundant pair
can't communicate as things continue to work.  (With some caveats,
see below.)

> 2) Multi-master databases are quite challenging in practice. This one
> appears to rely on timestamps from the system clock for conflict
> detection, which has been shown to be unreliable time and again in the
> application space.

You are incorrect.  The ISC implementation divides the free addresses
between the two servers.  The client will only interact with the
first to respond (literally, no timestamps involved).  Clients
talking to each half of a split brain can continue to receive
addresses from the shared range.  No timestamps are needed to resolve
conflicts, because the pool was split prior to the loss of
server-to-server communication.
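
To make the pre-split idea concrete, here is a minimal Python sketch.
It is a toy model, not the ISC code or the actual failover protocol:
the Server class, the split_pool helper, and the 192.0.2.0/24 range are
all hypothetical, and real servers rebalance their free addresses over
time rather than splitting once.  It only illustrates why two servers
that each lease from their own share cannot hand out the same address,
even when they cannot talk to each other.

```python
from ipaddress import IPv4Address


def split_pool(first: str, last: str):
    """Divide the free addresses in [first, last] between two servers.

    Hypothetical helper: a one-time even split, done while the
    servers can still communicate.
    """
    start, end = int(IPv4Address(first)), int(IPv4Address(last))
    free = [IPv4Address(n) for n in range(start, end + 1)]
    half = len(free) // 2
    return free[:half], free[half:]


class Server:
    """Toy server that leases only from its own share of the pool."""

    def __init__(self, name, free):
        self.name = name
        self.free = list(free)   # addresses this server may hand out
        self.leases = {}         # client id -> leased address

    def offer(self, client_id):
        # A client that already holds a lease gets the same address back.
        if client_id in self.leases:
            return self.leases[client_id]
        # Nothing left in this server's share of the pool.
        if not self.free:
            return None
        addr = self.free.pop(0)
        self.leases[client_id] = addr
        return addr


# The pool is split before any loss of server-to-server communication.
primary_free, secondary_free = split_pool("192.0.2.10", "192.0.2.29")
primary = Server("primary", primary_free)
secondary = Server("secondary", secondary_free)

# Split brain: each server answers a different set of clients and
# neither consults the other (or a clock) before answering.
a = [primary.offer(f"client-a{i}") for i in range(5)]
b = [secondary.offer(f"client-b{i}") for i in range(5)]

# No address was handed out by both halves.
assert set(a).isdisjoint(b)
```

The disjointness comes entirely from the split made ahead of time, so
no conflict resolution step is needed afterwards.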

There is a downside to this design, in that if half the brain goes
away, half of the free addresses become unusable with it until it
resynchronizes.  This can be mitigated by oversizing the pools.
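
As a rough way to size for that, here is a back-of-the-envelope sketch
in Python.  It assumes an even split of the free addresses and ignores
lease expiry and any rebalancing the real implementation does; the
function names and the numbers are made up for illustration.

```python
def usable_free_after_failure(pool_size: int, leased: int) -> int:
    """Free addresses the surviving server can still hand out.

    Assumes the free addresses were split evenly between the two
    servers and that the failed server's half is unusable.
    """
    free = pool_size - leased
    return free // 2


def min_pool_size(leased: int, new_clients_during_outage: int) -> int:
    """Smallest pool that lets one server alone serve the new clients.

    The survivor holds only half of the free addresses, so the free
    space must be twice the number of new clients expected while the
    peer is down.
    """
    return leased + 2 * new_clients_during_outage


# Hypothetical numbers: 200 addresses already leased out of 254.
survivor_free = usable_free_after_failure(pool_size=254, leased=200)

# To admit 50 new clients during an outage, the pool needs to be larger.
needed = min_pool_size(leased=200, new_clients_during_outage=50)
```

Under these assumptions the 254-address pool leaves the survivor with
27 usable free addresses, while covering 50 new clients would take a
pool of 300.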

> 3) There are single points of failure. You've traded hardware as a
> single point of failure for "bug-free implementation of clustering
> code on both DHCP servers" as a single point of failure. In general,
> software is far less reliable than hardware.

Fair enough.

However, I suspect most folks are not protecting against hardware
or software failures, but rather circuit failures between the client
and the DHCP servers.

I've actually never been a huge fan of large, centralized DHCP
servers, clustered or otherwise.  Too many eggs in one basket.  I
see how it may make administration a bit easier, but it comes at
the cost of a lot of resiliency.  Push them out to the edge, make
each one responsible for a local network or two.  Impact of an
outage is much lower.  If the router provides DHCP, the failure
modes work together: if the router goes down, so does the DHCP
server.

I think a lot of organizations only worry about the redundancy of
DHCP servers because the entire company is dependent on one server
(or cluster), and the rest of their infrastructure is largely
non-redundant.

-- 
       Leo Bicknell - bicknell@ufp.org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/


