[160478] in North American Network Operators' Group
Re: Level3 worldwide emergency upgrade?
daemon@ATHENA.MIT.EDU (Matthew Petach)
Wed Feb 6 13:23:05 2013
In-Reply-To: <20130206131015.GE20536@hijacked.us>
Date: Wed, 6 Feb 2013 10:22:52 -0800
From: Matthew Petach <mpetach@netflight.com>
To: Jonathan Towne <jtowne@slic.com>
Cc: nanog@nanog.org
On Wed, Feb 6, 2013 at 5:10 AM, Jonathan Towne <jtowne@slic.com> wrote:
> On Wed, Feb 06, 2013 at 07:57:06AM -0500, Alex Rubenstein scribbled:
> # The question should be more along the lines of, "why aren't you multihomed in a way that would make a 30 minute outage (which is inevitable) irrelevant to you?"
>
> The fun part of this emergency maintenance in the northeast USA was that even
> folks who are multihomed felt it: Level3 managed to do this in a way that
> kept BGP sessions up but killed the ability to actually pass traffic. I'm not
> sure what they did that caused this, or whether anyone but northeast folks
> were affected by it, but it sure was neat to be effectively blackholed in and
> out of one of your provided circuits for a while.
I recommend you grab
http://kestrel3.netflight.com/2013.02.05-NANOG57-day2-afternoon-session.txt
and search for PR8361907
Richard gave a very good lightning talk on why
Juniper boxes will bring up BGP but blackhole
traffic for anywhere from 30 minutes to over an hour,
depending on the number of BGP sessions they are handling.
His recommendation: if you don't like it, go tell
Juniper to fix that bug.
Matt
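
For anyone who wants to catch this failure mode on their own edge, here is a
minimal sketch of the idea: treat "BGP session established" and "traffic
actually passes" as two separate signals, and flag a circuit whose session is
up but whose data-plane probes go unanswered. Everything in it is hypothetical
and for illustration only (the CircuitStatus fields, the classify() helper, and
the 0.5 threshold are assumptions, not anything from the talk or from Juniper);
how you collect the session state and probe counts is left to your own tooling.

```python
from dataclasses import dataclass


@dataclass
class CircuitStatus:
    """Hypothetical snapshot of one upstream circuit."""
    bgp_established: bool   # control plane: is the BGP session up?
    probes_sent: int        # data plane: probes sent out via this circuit
    probes_answered: int    # data plane: replies that came back


def classify(status: CircuitStatus, min_success_ratio: float = 0.5) -> str:
    """Classify a circuit from control-plane and data-plane signals.

    The threshold is an arbitrary illustrative default, not a recommendation.
    """
    if not status.bgp_established:
        return "down"
    if status.probes_sent == 0:
        # No data-plane evidence either way; do not assume it is healthy.
        return "unknown"
    ratio = status.probes_answered / status.probes_sent
    # Session up but probes failing is the case described in this thread.
    return "healthy" if ratio >= min_success_ratio else "blackholed"


if __name__ == "__main__":
    print(classify(CircuitStatus(True, 10, 0)))
    print(classify(CircuitStatus(True, 10, 10)))
```

A "blackholed" result is the cue to shut the session or depref the routes by
hand, since BGP alone will keep sending traffic at that circuit.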