[173954] in North American Network Operators' Group

Re: Shaw routing issue 12 Aug 2014

daemon@ATHENA.MIT.EDU (Leah Ungstad)
Thu Aug 14 15:46:54 2014

X-Original-To: nanog@nanog.org
In-Reply-To: <CAB0xJrNjOYknfHofiLnqMk=CAV1KN7hboqnH0y=JKecPKxw66g@mail.gmail.com>
Date: Thu, 14 Aug 2014 12:46:45 -0700
From: Leah Ungstad <leah.ungstad@gmail.com>
To: Pete Lumbis <alumbis@gmail.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>, Geoffrey Keating <geoffk@geoffk.org>
Errors-To: nanog-bounces@nanog.org

Thanks for the info Pete, Geoffrey & Hugo!

LU


On Wed, Aug 13, 2014 at 6:07 PM, Pete Lumbis <alumbis@gmail.com> wrote:

> Yep. Most of the time I've seen this, it's two data centers that both hit a
> TCAM exception. You reboot DC1, and when it comes back up you reboot DC2.
> While DC2 is down there are no iBGP-learned routes, so DC1 is fine. DC2 is
> fine too, until its iBGP peer comes back, and then it starts all over again.
>
>
> On Wed, Aug 13, 2014 at 6:06 PM, Geoffrey Keating <geoffk@geoffk.org>
> wrote:
>
>> Pete Lumbis <alumbis@gmail.com> writes:
>>
>> > Maybe related to the 512k route issue?
>> > http://www.bgpmon.net/what-caused-todays-internet-hiccup/
>> >
>> > I've seen people reboot to recover from TCAM exception without adjusting
>> > TCAM size only to run into the issue all over again. It's a fun way to
>> > watch the problems roll around the network.
>>
>> In this case, it would probably have "helped" in the same way as
>> rebooting or waving a rubber chicken or whatever sometimes "helps": the
>> route issue was caused initially by a problem at Verizon that
>> caused them to deaggregate, which they fixed, so by the time someone had
>> identified the problem, paged someone, gotten them to the data center,
>> had a teleconference, rebooted the device, waited for it to come back
>> up...  Verizon would have fixed it, so when it came back up it'd be
>> back under 512k again.
>>
>
>
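
The rolling failure Pete describes can be sketched as a toy model. This is only an illustration under stated assumptions: the route counts are invented, the limit is taken to be 512 * 1024 entries to match the "512k" figure in the thread, and the function names are hypothetical rather than anything from a real router or library.

```python
# Toy model of the "rolling" TCAM exception between two data centers.
# Assumption: the forwarding table holds at most 512 * 1024 IPv4 routes.
TCAM_LIMIT = 512 * 1024


def table_size(local_routes: int, ibgp_routes: int, peer_up: bool) -> int:
    """Routes a router must install: its own, plus iBGP-learned ones if the peer is up."""
    return local_routes + (ibgp_routes if peer_up else 0)


def tcam_exception(local_routes: int, ibgp_routes: int, peer_up: bool,
                   limit: int = TCAM_LIMIT) -> bool:
    """True when the table no longer fits in the (assumed) TCAM capacity."""
    return table_size(local_routes, ibgp_routes, peer_up) > limit


# Hypothetical route counts, chosen only so that each site fits alone
# but the combined table does not.
DC1_LOCAL = 300_000
DC2_LOCAL = 300_000

# DC2 is rebooting: DC1 sees no iBGP-learned routes and looks healthy.
dc1_while_dc2_down = tcam_exception(DC1_LOCAL, DC2_LOCAL, peer_up=False)

# DC2 is back and the iBGP session re-establishes: DC1 overflows again.
dc1_after_dc2_returns = tcam_exception(DC1_LOCAL, DC2_LOCAL, peer_up=True)

print(dc1_while_dc2_down, dc1_after_dc2_returns)
```

In this model a reboot only "helps" while the peer is down. Unless the limit is raised or the route count drops, the exception returns as soon as both sides exchange routes again.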
