[129132] in North American Network Operators' Group
Re: Did your BGP crash today?
daemon@ATHENA.MIT.EDU (Jared Mauch)
Fri Aug 27 15:24:57 2010
From: Jared Mauch <jared@puck.nether.net>
In-Reply-To: <4C780F54.8070607@unfix.org>
Date: Fri, 27 Aug 2010 15:22:52 -0400
To: Jeroen Massar <jeroen@unfix.org>
Cc: NANOG list <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
On Aug 27, 2010, at 3:17 PM, Jeroen Massar wrote:
> On 2010-08-27 21:13, Richard A Steenbergen wrote:
>> On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
>>>
>>> Unknown BGP attribute 99 (flags: 240)
>>> Unknown BGP attribute 99 (flags: 240)
>>> Unknown BGP attribute 99 (flags: 240)
>>> Unknown BGP attribute 99 (flags: 240)
>>> Unknown BGP attribute 99 (flags: 240)
>>
>> Just out of curiosity, at what point will we as operators rise up
>> against the ivory tower protocol designers at the IETF and demand that
>> they add a mechanism to not bring down the entire BGP session because of
>> a single malformed attribute? Did I miss the memo about the meeting?
>> I'll bring the punch and pie.
>
> Complain to your vendor, especially C & J are having good enough
> influence on the IETF to make such a change possible.
>
>
> I can agree with tearing the session down when one encounters an
> improperly formatted message, but an unknown attribute, while the rest
> of the format of message is fine, is a silly thing to hang up on indeed.
When you are processing a message, it's sometimes hard to tell whether
something was just mis-parsed (as I think is the case here with the
"missing-2-bytes") or whether you are simply receiving garbage. Perhaps
there should be some way to "re-sync" when you hit this problem, or a
parallel "keepalive" path between the devices, similar to
MACA/MCAS/MIDCAS/TCAS, so they can still talk when something bad is
happening.
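
For reference, here is a rough sketch of how the flags byte in the log lines
above decodes, and why a length that is off by a couple of bytes is hard to
tell apart from garbage. The bit layout is the one in RFC 4271 section 4.3;
the function names (decode_attr_flags, parse_attr) are made up for
illustration and are not taken from any vendor's code. It does not claim to
reproduce what actually went wrong on the wire today.

```python
# Path attribute flag bits per RFC 4271 section 4.3 (high-order bit first).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10  # length field is 2 octets instead of 1


def decode_attr_flags(flags):
    """Return the names of the flag bits set in a path attribute flags octet."""
    names = []
    if flags & OPTIONAL:
        names.append("optional")
    if flags & TRANSITIVE:
        names.append("transitive")
    if flags & PARTIAL:
        names.append("partial")
    if flags & EXTENDED_LENGTH:
        names.append("extended-length")
    return names


def parse_attr(buf, offset=0):
    """Parse one path attribute starting at offset.

    Returns (flags, type_code, value, next_offset). Raises ValueError if the
    declared length runs past the end of the buffer, which is the point where
    a parser can no longer tell a mis-encoded attribute from garbage.
    """
    if len(buf) - offset < 3:
        raise ValueError("truncated attribute header")
    flags = buf[offset]
    type_code = buf[offset + 1]
    if flags & EXTENDED_LENGTH:
        if len(buf) - offset < 4:
            raise ValueError("truncated extended-length header")
        length = int.from_bytes(buf[offset + 2:offset + 4], "big")
        start = offset + 4
    else:
        length = buf[offset + 2]
        start = offset + 3
    end = start + length
    if end > len(buf):
        raise ValueError("attribute length runs past end of buffer")
    return flags, type_code, bytes(buf[start:end]), end


if __name__ == "__main__":
    # 240 == 0xF0: all four defined flag bits are set.
    print(decode_attr_flags(240))
    # Hypothetical well-formed attribute: type 99, 2-octet length, 2 octets of value.
    print(parse_attr(bytes([240, 99, 0, 2, 0xAA, 0xBB])))
```

Every attribute after the first is located using the previous attribute's
length, so once one length is wrong the parser reads the next "header" out of
the middle of some value, and everything that follows looks like garbage.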
- Jared