[180349] in North American Network Operators' Group


Re: BGP Multihoming 2 providers full or partial?

daemon@ATHENA.MIT.EDU (Maqbool Hashim)
Mon Jun 1 12:31:03 2015

X-Original-To: nanog@nanog.org
From: Maqbool Hashim <maqbool@madbull.info>
To: Baldur Norddahl <baldur.norddahl@gmail.com>, "nanog@nanog.org"
 <nanog@nanog.org>
Date: Mon, 1 Jun 2015 16:28:55 +0000
In-Reply-To: <CAPkb-7B-_Q4cPYDvDc6K-bJEAOBSjWkMa6FDX2KD4Tt9ZaTvPg@mail.gmail.com>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

First off, thanks to everyone who responded to my original post: very
instructive and informative replies, along with a good view of different
perspectives.

Baldur, you pointed out that for ingress taking partials is exactly the
same; we are only affected on outbound, and we can achieve a large part of
the redundancy for outbound too.  Someone else pointed out that partitions
of the Internet as viewed from our two providers often last minutes rather
than hours.  Given this input I really lean towards Baldur's statement that
we can probably spend the money better elsewhere.

One point I will try to make internally is "Do we care about all of the
Internet all of the time?"; note we are not an ISP.  Basically, if some part
of the Internet is unreachable for a "short" period, will we even notice it?
Always if it is one of our remote sites, but of course we can mitigate that
by making those part of the partials that we take from both of our
providers.
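
As a rough illustration of the idea above (remote-site prefixes taken as
partials from both providers, everything else following a default route),
here is a minimal Python sketch. It is not a router configuration. The
provider names, the prefixes (RFC 5737 documentation ranges) and the
function pick_egress are all hypothetical, and the selection rule is
simplified to "longest match, then default preference order".

```python
import ipaddress

# Hypothetical preference order for the default route.
DEFAULT_ORDER = ["provider_a", "provider_b"]


def pick_egress(dest, announced, default_order=DEFAULT_ORDER):
    """Pick an egress provider for dest.

    announced maps each provider whose session is up to the list of
    partial prefixes it currently announces to us. The longest matching
    prefix wins; ties go to the provider listed first in default_order.
    If no partial matches, fall back to the default route via the first
    provider that is up.
    """
    addr = ipaddress.ip_address(dest)
    best = None  # (prefix length, provider)
    for provider in default_order:
        for prefix in announced.get(provider, []):
            net = ipaddress.ip_network(prefix)
            if addr in net and (best is None or net.prefixlen > best[0]):
                best = (net.prefixlen, provider)
    if best is not None:
        return best[1]
    for provider in default_order:
        if provider in announced:
            return provider
    return None  # no provider is up


# Assumed remote-site prefix 198.51.100.0/24, normally heard from both.
normal = {
    "provider_a": ["198.51.100.0/24"],
    "provider_b": ["198.51.100.0/24"],
}
# Provider A loses its path to the remote site and withdraws the prefix.
partitioned = {
    "provider_a": [],
    "provider_b": ["198.51.100.0/24"],
}
```

With both announcements present, traffic to the remote site follows the
preferred provider. When provider A withdraws the prefix, the more specific
route from provider B takes over for that site, while destinations covered
only by the default still go out via provider A.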

By taking full routes I can only see us protecting the view of the whole
Internet for our internal web browsing clients. After all, if a partition to
a "busy" part of the Internet happens we will notice it straight away
(Google etc.), but if it is someone's iTunes server on the end of some small
DSL provider, do we care?

One thing I would rather not do is manage static routes on the BGP routers;
it seems counterintuitive on the face of it.

________________________________________
From: NANOG <nanog-bounces@nanog.org> on behalf of Baldur Norddahl <baldur.norddahl@gmail.com>
Sent: 01 June 2015 16:49
To: nanog@nanog.org
Subject: Re: BGP Multihoming 2 providers full or partial?

On 1 June 2015 at 15:29, Blake Hudson <blake@ispn.net> wrote:

> Something to point out: Sometimes the device you connect to is up, but has
> no reachability to the rest of the world. Using static routes is.. well..
> static. There are a few cases (such as the one mentioned) where a static
> route can be somewhat dynamic. Another case is when the static route next
> hop does not respond to ARP requests or some machines have the ability to
> perform triggered actions on some sort of event/test. But why bother with
> BGP if you're just going to override its decisions by using static routes?
>
> As another commenter mentioned, using anything less than a full table is a
> compromise. If one wants the redundancy in the case of an upstream ISP
> outage, take full routes. If one wants the traffic engineering flexibility,
> take full routes and use a BGP knob like route maps to modify existing
> prefixes rather than make up your own. A default route of last resort is
> fine; Overriding BGP through static routes degrades the utility of BGP.
>

Thanks for pointing this out. However I would like to argue whether this is
a big drawback or not.

If the original poster had infinite money and infinite resources there
would be no question to ask. Just get the most expensive router out there
and get full tables.

So given that the money could be spent on other things, that might be more
helpful for his company, is it good value to invest in new routers? I
believe every company and NOC team needs to decide this for themselves. I
do however feel this is often a rushed decision because people have an idea
that anything less than full tables is not good enough and that you are not
a real ISP if you do not have full tables etc.

It is true that your static routes could end up pointing at a half dead
router, that still keeps the link up. But it is also perfectly possible for
a router to keep advertising routes, that it really can't forward traffic
to or where there are service problems so severe that it amounts to the
same (excessive packet loss etc). This is supposed to be rare for a good
quality transit provider and the remedy is the same (manually take the link
down).

We got our big routers and full tables early on. With perfect 20/20
hindsight I am not sure I would spend the money that way if I had to do it
over.

All I am saying is that you can get most of the value with partial tables.
You get 100% of it with ingress traffic and you can move a very large
fraction of your egress exactly the same. Your redundancy might not be
equal, but it will not be entirely bad.

Regards,

Baldur
