[137899] in North American Network Operators' Group

Re: BGP Failover Question

daemon@ATHENA.MIT.EDU (Owen DeLong)
Tue Feb 22 14:24:33 2011

From: Owen DeLong <owen@delong.com>
In-Reply-To: <AANLkTikWegXv1WVoihWxupsHTe4OWDNYRnaKniKzmJVt@mail.gmail.com>
Date: Tue, 22 Feb 2011 11:20:12 -0800
To: Hammer <bhmccie@gmail.com>
Cc: nanog@nanog.org
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org


On Feb 22, 2011, at 10:52 AM, Hammer wrote:

> I agree. But swapping providers is not the default answer in some
> environments. I work in an enterprise with multiple GE circuits from
> multiple providers to the Internet. The lead time on calling up a
> different carrier and saying "I need a gigabit connection to the
> Internet" would probably be 90-120 days. And then you get to go thru the
> contracts/negotiations and MSAs. You don't just flip. In smaller
> operations I understand. But I was simply saying that it's not always
> that easy. If I went to my boss and said one of our carriers sucks and
> we should dump them he would just laugh and throw me out.
>
That depends on where you are. If you have a router in one or more of
the many "carrier hotels" around the world, you can usually order a new
Gig-E cross-connect with service in less than a week. If you need to
have a circuit engineered, then, 30-90 days is probably about right. If
you need to have facilities installed to provide said circuit, it can be
as much as 180 days.

However, I don't think the point was "disconnect them tomorrow". I think
the point was "If the impact is that severe, the sooner you start the
new provider process, the sooner you get relief."

> 1. What are the SLAs with the carrier in question? Do you have them
> clearly defined? Are they out of SLA? If so, what compensation are you
> entitled to based on violation of said SLA?

99.99% of all SLAs are a pittance of money refunded IF you jump through
extreme hoops to collect. They are rarely sufficient to resolve
or even compensate for outages.

>
> 2. What trending are you doing to document the failures in SLA of the
> carrier in question? Do we have a documented pattern of poor performance
> by using that trending?
>
> 3. What are our contractual or legal options based on items 1 and 2?
>
> 4. Don't forget about the Layer 8 (political) factor. If your telco
> manager is buddies with the carrier then you have to double your
> documentation against them. Some companies spend tens of millions a
> month on circuits. You better be ready to justify yourself.

Yeah, this is usually the biggest problem.

Owen

>
> -Hammer-
>
> "I was a normal American nerd."
> -Jack Herer
>
> On Tue, Feb 22, 2011 at 12:38 PM, Owen DeLong <owen@delong.com> wrote:
> Assuming that he has provider independent space (why run full BGP feeds
> if you are not multihomed?), then, actually it's about on par and less
> disruptive in general. Add new provider, wait a day or two, then
> disconnect old provider.
>
> If he's using provider assigned space, then, the big hurdle is switching
> to provider independent (requires a renumber), but, that's a good idea
> for a variety of reasons.
>
> I would hardly call the type and frequency of outages described a "whim"
> when using that as a reason to change providers. Sounds like he is
> suffering severe impact to his business.
>
> Owen
>
> On Feb 22, 2011, at 10:15 AM, Hammer wrote:
>
> > I'm not arguing that at all. But it wasn't relevant to the question at
> > hand. And depending on the scale of your business, dumping providers is
> > not something done on a whim. It's not like you're fed up with DSL and
> > want to convert to Cable.
> >
> >
> > -Hammer-
> >
> > "I was a normal American nerd."
> > -Jack Herer
> >
> >
> >
> >
> >
> > On Tue, Feb 22, 2011 at 12:11 PM, Bret Clark <bclark@spectraaccess.com> wrote:
> >
> >> On 02/22/2011 12:23 PM, Hammer wrote:
> >>
> >>> As Max stated, you can set triggers based on thresholds that are
> >>> monitored via multiple methods in Cisco IOS. That way you could force
> >>> the route down dynamically. There's always a risk when letting the
> >>> machines do the thinking but this would help in situations like this.
> >>> Can't speak for other vendors but I'm sure the features are similar.
> >>>
> >> Well, as someone else stated, if an upstream provider can't provide BGP
> >> reliably then it's time to give them the boot. Once in a year, okay, but
> >> beyond that, then it's time to read the riot act with that provider.
> >> Bret
> >>
> >>
>
