[115024] in North American Network Operators' Group


Re: Facility wide DR/Continuity

daemon@ATHENA.MIT.EDU (gb10hkzo-nanog@yahoo.co.uk)
Wed Jun 3 10:55:15 2009

Date: Wed, 3 Jun 2009 07:53:30 -0700 (PDT)
From: gb10hkzo-nanog@yahoo.co.uk
To: Jim Wise <jwise@draga.com>
In-Reply-To: <87r5y1750v.fsf@gondolin.draga.com>
Cc: nanog@nanog.org
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org


As with all things, there's no "right answer" ... a lot of it depends on
three things:

- what you are hoping to achieve
- what your budget is
- what you have at your disposal in terms of numbers of qualified staff
  available to both implement and support the chosen solution

Those are the main business-level factors.  At the technical level, two key
factors (although, of course, there are many others to consider) are:

- whether you are after an active/active or active/passive solution
- what the underlying application(s) are (e.g. you might have other options
  such as anycast with DNS)

Anyway, there's a lot to consider.  And despite all the expertise on NANOG,
I would still suggest the original poster does their fair share of their
own homework. :)


----- Original Message ----
From: Jim Wise <jwise@draga.com>
To: gb10hkzo-nanog@yahoo.co.uk
Cc: nanog@nanog.org
Sent: Wednesday, 3 June, 2009 15:42:24
Subject: Re: Facility wide DR/Continuity

gb10hkzo-nanog@yahoo.co.uk writes:

> On the subject of DNS GSLB, there's a fairly well known article on the
> subject that anyone considering implementing it should read at least
> once.... :)
>
> http://www.tenereillo.com/GSLBPageOfShame.htm
> and part 2
> http://www.tenereillo.com/GSLBPageOfShameII.htm
>
> Yes it was written in 2004.  But all the "food for thought" that it
> provides is still very much applicable today.

One thing I've noticed about this paper in the past that kind of bugs me
is that in arguing that multiple A records are a better solution than a
single GSLB-managed A record, the paper assumes that browsers and other
common internet clients will actually cache multiple A records, and fail
over between them if the earlier A records fail.  The first of the two
pages explicitly touts this as a high availability solution.

However, I haven't observed this behavior from browsers, media players,
and similar programs `in the wild' -- as far as I've been able to tell,
most client software picks an A record from those returned (possibly,
but not usually, skipping those found to be unreachable), and then holds
onto that choice of IP address until the record times out of cache, and
a new request is made.

Have I been unlucky in my observations?  Are there client programs which
do fail over between multiple A records returned for a single name --
presumably sticking with one IP for session-affinity purposes until a
failure is detected?

If clients do not behave this way, then the paper's observations about
GSLB for HA purposes don't seem to hold -- though in my limited
experience the paper's other point (that geographic dispatch is Hard)
seems much more accurate (making GSLB a better HA solution than it is a
load-sharing solution, again, at least in my experience).

Or am I missing something?

-- 
                Jim Wise
                jwise@draga.com
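For anyone who wants to see what the client-side failover asked about above
would look like, here is a minimal Python sketch.  It is illustrative only
and not taken from the paper or from any particular browser: the function
name connect_with_failover and its resolver/connector parameters are
hypothetical, added so the logic can be exercised without a network.  It
resolves every IPv4 address for a name, tries them in the order returned, and
moves on to the next address when a connection attempt fails.

```python
import socket


def connect_with_failover(host, port, timeout=3.0,
                          resolver=socket.getaddrinfo,
                          connector=socket.create_connection):
    """Try each A record for host in turn; return (sock, ip) on first success.

    resolver and connector are hypothetical hooks (defaulting to the real
    stdlib calls) so the failover logic can be tested offline.
    """
    infos = resolver(host, port, socket.AF_INET, socket.SOCK_STREAM)

    # Keep the resolver's ordering, dropping duplicate addresses.
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])

    last_error = None
    for ip in addresses:
        try:
            # Connecting to a literal IP means exactly one address is tried
            # per call, so the failover between records happens here.
            return connector((ip, port), timeout=timeout), ip
        except OSError as exc:
            last_error = exc  # remember the failure, try the next A record

    raise ConnectionError(
        f"all {len(addresses)} addresses for {host} failed") from last_error
```

A caller that wants the session affinity mentioned above would keep using the
returned ip until a connection to it fails, and only then call the function
again.  Note that Python's own socket.create_connection(), when handed a
hostname rather than an IP, already tries each resolved address in turn;
whether a given browser or media player does the same is exactly the open
question in this thread.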

