[161675] in North American Network Operators' Group

Re: Is multihoming hard? [was: DNS amplification]

daemon@ATHENA.MIT.EDU (Owen DeLong)
Sun Mar 24 22:37:01 2013

From: Owen DeLong <owen@delong.com>
In-Reply-To: <CA+TcGd-1L3HxkdtE6=E6DqjnkD2N9u+_v7wtVuJkrWTDCf5-Tg@mail.gmail.com>
Date: Sun, 24 Mar 2013 19:31:38 -0700
To: Kyle Creyts <kyle.creyts@gmail.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

I assume those people will not bother with any attempt to multihome in
any form.
They are not, therefore, part of what is being discussed here.

Owen

On Mar 23, 2013, at 19:47 , Kyle Creyts <kyle.creyts@gmail.com> wrote:

> You do realize that there are quite a few people (home broadband
> subscribers?) who just "go do something else" when their internet goes
> down, right?
>
> There are people who don't understand the difference between "a site
> being slow" and packet-loss. For many of these people, losing internet
> service carries zero business impact, and relatively little life impact;
> they might even realize they have better things to do than watch cat
> videos or scroll through endless social media feeds.
>
> Will they really demand ubiquitous, unabridged connectivity?
>
> When?
>
> On Mar 23, 2013 12:58 PM, "Owen DeLong" <owen@delong.com> wrote:
> >
> >
> > On Mar 23, 2013, at 12:12 , Jimmy Hess <mysidia@gmail.com> wrote:
> >
> > > On 3/23/13, Owen DeLong <owen@delong.com> wrote:
> > >> A reliable cost-effective means for FTL signaling is a hard problem
> > >> without a known solution.
> > >
> > > Faster than light signalling is not merely a hard problem.
> > > Special relativity doesn't provide that information may travel faster
> > > than the maximum speed C. If you want to signal faster than light,
> > > then slow down the light.
> > >
> > >> An idiot-proof simple BGP configuration is a well known solution.
> > >> Automating it would be relatively simple if there were the will to
> > >> do so.
> > >
> > > Logistical problems...  if it's a multihomed connection, which of the
> > > two or three providers manages it,  and gets to blame the other
> > > provider(s) when anything goes wrong: or are you gonna rely on the
> > > customer to manage it?
> > >
> >
> > The box could (pretty easily) be built with a "Primary" and "Secondary"
> > port.
> >
> > The cable plugged into the primary port would go to the ISP that sets the
> > configuration. The cable plugged into the other port would go to an ISP
> > expected to accept the announcements of the prefix provided by the ISP
> > on the primary port.
> >
> > BFD could be used to illuminate a tri-color LED on the box for each port,
> > which would be green if BFD state is good and red if BFD state is bad.
> >
> > At that point, whichever one is red gets the blame. If they're both green,
> > then traffic is going via the primary and the primary gets the blame.
> >
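
The LED and blame rules described above can be sketched in a few lines.
This is only an illustration: the function names are made up for this
example, and the BFD session state for each port is assumed to be
available as a simple up/down boolean. The third LED colour is not
described in the thread, so it is left out.

```python
from typing import List


def led_color(bfd_up: bool) -> str:
    """Green when the BFD session on a port is up, red when it is down."""
    return "green" if bfd_up else "red"


def providers_to_blame(primary_bfd_up: bool, secondary_bfd_up: bool) -> List[str]:
    """Return the provider(s) to call first, following the rule of thumb above.

    Any port showing red gets the blame. If both ports are green, traffic is
    flowing via the primary, so the primary gets the blame.
    """
    blamed = []
    if not primary_bfd_up:
        blamed.append("primary")
    if not secondary_bfd_up:
        blamed.append("secondary")
    # Both green: traffic prefers the primary, so start there.
    return blamed or ["primary"]


if __name__ == "__main__":
    for p_up, s_up in [(True, True), (False, True), (True, False), (False, False)]:
        print(led_color(p_up), led_color(s_up), providers_to_blame(p_up, s_up))
```
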
> > If you absolutely have to troubleshoot which provider is broken, then
> > start by unplugging the secondary. If it doesn't start working in 5 minutes,
> > then clearly there's a problem with the primary regardless of what else
> > is happening.
> >
> > Lather, rinse, repeat for the secondary.
> >
> > > Someone might be able to make a protocol that lets this happen, which
> > > would need to detect on a per-route basis any performance/connectivity
> > > issues, but I would say it's not any known implementation of BGP.
> >
> > A few additional options to DHCP could actually cover it from the primary
> > perspective.
> >
> > For the secondary provider, it's a little more complicated, but could be
> > mostly automated so long as the customer identifies the primary provider
> > and/or provides an LOA for the authorized prefix from the primary to
> > the secondary.
> >
> > The only complexity in the secondary case is properly filtering the
> > announcement of the prefix assigned by the primary.
> >
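
As a rough illustration of the filtering step on the secondary provider's
side, the check could look like the sketch below. The function name is
hypothetical, and the exact-match policy is an assumption made for this
example (a real provider might also bound the prefix length or consult
other data). The prefix shown is a documentation range, not a real
assignment.

```python
import ipaddress


def accept_announcement(announced: str, authorized: str) -> bool:
    """Accept a customer announcement only if it is exactly the prefix
    the customer is authorized (e.g. via LOA) to announce.

    Malformed prefixes, more-specifics, and unrelated prefixes are rejected.
    """
    try:
        announced_net = ipaddress.ip_network(announced, strict=True)
        authorized_net = ipaddress.ip_network(authorized, strict=True)
    except ValueError:
        # Not a valid prefix (bad syntax or host bits set): reject.
        return False
    return announced_net == authorized_net


if __name__ == "__main__":
    authorized = "203.0.113.0/24"  # documentation prefix, for illustration only
    for candidate in ["203.0.113.0/24", "203.0.113.0/25", "198.51.100.0/24"]:
        print(candidate, accept_announcement(candidate, authorized))
```
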
> > >> 1.   ISPs are actually motivated to prevent customer mobility, not
> > >>      enable it.
> > >
> > >> 2.   ISPs are motivated to reduce, not increase the number of multi-homed
> > >>      sites occupying slots in routing tables.
> > >
> > >    This is not some insignificant thing.   The ISPs have to maintain
> > >    routing tables as well;  ultimately the ISP's customers are in bad
> > >    shape, if too many slots are consumed.
> > >
> >
> > I never said it was insignificant. I said that solving the multihoming
> > problem in this manner was trivial if there was will to do so. I also said
> > that the above were contributing factors in the lack of will to do so.
> >
> > > How about
> > >   3.  Increased troubleshooting complexity when there are potential
> > >       issues or complaints.
> > >
> >
> > I do not buy that it is harder to troubleshoot a basic BGP configuration
> > than a multi-carrier NAT-based solution that goes woefully awry.
> >
> > I'm sorry, I've done the troubleshooting on both scenarios and I have
> > to say that if you think NAT makes this easier, you live in a different
> > world than I do.
> >
> > > The concept of a "fool proof"  BGP configuration is clearly a new sort
> > > of myth.
> >
> > Not really.
> >
> > Customer router accepts default from primary and secondary providers.
> > So long as the primary's default remains, primary is preferred. If the
> > primary's default goes away, secondary is preferred.
> >
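
The default-route preference described above could be modelled roughly as
follows. This is a sketch, not the configuration actually deployed: the
names and the local-preference values (200 and 100) are assumptions chosen
for illustration, with the higher value winning as it does in BGP best-path
selection.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed values for illustration: higher local preference wins.
PORT_LOCAL_PREF = {"primary": 200, "secondary": 100}


@dataclass(frozen=True)
class DefaultRoute:
    port: str        # "primary" or "secondary"
    local_pref: int


def learned_defaults(primary_up: bool, secondary_up: bool) -> List[DefaultRoute]:
    """Default routes currently received, one per provider that is up."""
    state = {"primary": primary_up, "secondary": secondary_up}
    return [DefaultRoute(port, PORT_LOCAL_PREF[port])
            for port, up in state.items() if up]


def best_default(routes: List[DefaultRoute]) -> Optional[DefaultRoute]:
    """Pick the default with the highest local preference, if any exists."""
    return max(routes, key=lambda r: r.local_pref, default=None)


if __name__ == "__main__":
    for p_up, s_up in [(True, True), (False, True), (True, False), (False, False)]:
        best = best_default(learned_defaults(p_up, s_up))
        print(p_up, s_up, best.port if best else "no default route")
```
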
> > Customer box gets prefix (via DHCP-PD or static config or whatever
> > either from primary or from RIR). Advertises prefix to both primary
> > and secondary.
> >
> > All configuration of the BGP sessions is automated within the box
> > other than static configuration of customer prefix (if static is desired).
> >
> > Primary/Secondary choice is made by plugging providers into the
> > Primary or Secondary port on the box.
> >
> > > The idea that the protocol on its own, with a very basic config, does
> > > not ever require any additional attention,  to achieve expected
> > > results;  where expected results include isolation from any faults
> > > with the path from one of the user's two, three, or four providers,
> > > and  balancing for optimal throughput and best latency/loss to every
> > > destination.
> >
> > I have installed these configurations at customer sites for several of
> > my consulting clients that wanted to multihome their SMBs.
> >
> > Some of them have been running for more than 8 years without a
> > single issue.
> >
> > For all of the above requirements, no. You can't do that with the most
> > advanced manual BGP configurations today.
> >
> > However, if we reduce it to:
> >
> > 1.      The internet connection stays up so long as one of the two
> >         providers is up.
> >
> > 2.      Traffic prefers the primary provider so long as the primary
> >         provider is up.
> >
> > 3.      My addressing remains stable so long as I remain connected to
> >         the primary provider (or if I use RIR based addressing, longer).
> >
> > Then what I have proposed actually is achievable, does work, and
> > does actually meet the needs of 99+% of organizations that wish to
> > multihome.
> >
> > > BGP multihoming doesn't  prevent users from having issues because:
> > >
> > >      o Connectivity issues that are a responsibility of one of their
> > >         providers, that they might have expected multihoming to
> > >         protect them against (latency, packet loss).
> >
> > Correct. However, this is true of ANY multihoming solution. The
> > dual-provider NAT solution certainly does NOT improve this.
> >
> > >      o Very poor performance of one of their links;  or poor
> > >         performance of one of their links to their favorite destination
> >
> > See above.
> >
> > >      o Asymmetric paths;  which means that when latency or loss is poor,
> > >         the customer doesn't necessarily know which provider to blame,
> > >         or if both are at fault,  and  the providers can spend a lot of
> > >         time blaming each other.
> >
> > See above.
> >
> > > These are all solvable problems,   but at cost, and therefore not for
> > > mass-market lowest cost ISP service.
> >
> > My point is that the automated simple BGP solution I propose can provide
> > a better customer experience than the currently popular NAT-based
> > multihoming with simpler troubleshooting and lower costs.
> >
> > > It's not as if they can have
> > >    "Hello, DSL technical support...  did you try shutting off your
> > >    other peers and retesting?"
> >
> > ROFL.
> >
> > > The average end user won't have a clue -- they will need one of the
> > > providers, or someone else to be managing that for them,  and
> > > understand  how each provider is connected.
> >
> > Again, you're setting a much higher goal than I was.
> >
> > My goal was to do something better than what is currently being done.
> > (Connect a router to two providers and use NAT to choose between them).
> >
> > > I don't see large ISPs  training up their support reps for  DSL
> > > $60/month services, to handle BGP troubleshooting, and multihoming
> > > management/repair.
> >
> > But they already get stuck with this in the current NAT-based solution
> > which is even harder to troubleshoot and creates even more problems.
> >
> > Owen
> >
> >
