[45157] in North American Network Operators' Group
Re: Persistent BGP peer flapping - do you care?
daemon@ATHENA.MIT.EDU (Susan Hares)
Fri Jan 18 21:36:33 2002
Message-Id: <5.0.0.25.0.20020118213039.02c978a8@mail.nexthop.com>
Date: Fri, 18 Jan 2002 21:35:35 -0500
To: "Dickson, Brian" <brian.dickson@velocita.com>
From: Susan Hares <skh@nexthop.com>
Cc: "'nanog@merit.edu'" <nanog@merit.edu>
In-Reply-To: <AFF0F85B04E5D511A1540002B35B7D2206E9FC@exchhq00>
Brian:
Thank you for your two cents; I really appreciate your comments. I'm
gathering input until Sunday night. At that time I'll summarize it
to the list and suggest some ideas. I'll also try to boil the input on
this problem down into a document that I can post to IDR and NANOG.
Sue
PS - I'm away from email from now until Monday morning. Thanks, NANOG folks!
At 07:30 PM 1/17/2002 -0500, Dickson, Brian wrote:
>Here's my two cents...
>
>A good rule of thumb (the robustness principle, stated in RFC 793 and
>RFC 1122) is: be liberal in what you accept and strict in what you send.
>
>Applied to BGP, I would suggest that any implementation should choose a
>canonical form for constructing updates, but use a parser that allows for
>rule-bending without rule-breaking.
>
>On the issue of existing vendor implementations, and how to build the specs
>to prevent meltdowns:
>
>I would suspect that during implementation, brand C routers were the victims
>during testing, and perhaps the change was made to avoid that happening.
>
>The current state of affairs is very much like the classical game-theory
>"prisoner's dilemma".
>
>The new spec should have two goals - discourage any implementation which can
>lead to meltdowns, and encourage strict adherence to the spec. The latter
>can be achieved via the former, in fact, if the mechanisms are well chosen.
>
>My suggestion would be, rather than backing off on BGP session resets,
>to first attempt strict interpretation (to insulate against completely
>insane routers), and then fall back to loose interpretation. The model is
>"Fool me once, shame on you; fool me twice, shame on me."
>
>On first receiving a bad update, reset the session. If, upon re-establishing
>the session, the same bad update is heard, drop the bad update but keep the
>session up (along with the messages back, etc.).
>
>One additional optional behaviour I would suggest: look at the AS path
>and/or path length and/or announcing router IP address. If the bad update is
>heard directly from the originator, drop the session (and either keep it
>down, or try one more time before requiring operator intervention). It may
>be that only these conditions strictly require a reset, and that all other
>situations require only the "ignore bad routes" behaviour.
>
>Resetting BGP more than a small, finite number of times is, IMHO, a bad
>idea. After all, BGP is a stateful protocol, and state changes should be
>triggered deterministically, even if that requires operator input.
>
>Brian Dickson
>Velocita
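
The reset-then-ignore policy proposed in the quoted message can be sketched
roughly as follows. This is only an illustration in Python, not any vendor's
implementation or anything taken from the BGP spec. The names PeerState,
Action and on_bad_update, the idea of identifying "the same bad update" by a
fingerprint, and the limit of two resets are all assumptions made for the
example. Treating an exhausted reset budget as "drop the update" is also an
assumption; the message only says resets should be few and finite.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    RESET_SESSION = "reset the session"
    DROP_UPDATE = "drop the bad update, keep the session up"
    HOLD_DOWN = "keep the session down until an operator intervenes"


@dataclass
class PeerState:
    """Per-peer memory of bad updates (hypothetical structure)."""
    max_resets: int = 2           # assumed value for "a small, finite number"
    resets: int = 0               # resets already spent on this peer
    originator_retries: int = 0   # retries spent when the peer is the originator
    seen_bad: set = field(default_factory=set)  # fingerprints of bad updates seen


def on_bad_update(state: PeerState, fingerprint: bytes,
                  from_originator: bool) -> Action:
    """Decide what to do with a malformed update from one peer."""
    if from_originator:
        # The peer itself originated the bad route: drop the session,
        # allow one more try, then require operator intervention.
        if state.originator_retries >= 1:
            return Action.HOLD_DOWN
        state.originator_retries += 1
        return Action.RESET_SESSION

    already_seen = fingerprint in state.seen_bad
    state.seen_bad.add(fingerprint)

    if already_seen or state.resets >= state.max_resets:
        # "Fool me twice": the same bad update came back after a reset,
        # or the reset budget is spent (assumed fallback). Ignore the
        # route instead of flapping the session again.
        return Action.DROP_UPDATE

    # "Fool me once": first sighting, so interpret strictly and reset.
    state.resets += 1
    return Action.RESET_SESSION


if __name__ == "__main__":
    peer = PeerState()
    for update in (b"bad-1", b"bad-1", b"bad-2", b"bad-3"):
        print(update, on_bad_update(peer, update, from_originator=False).value)
```

Keeping the decision in one function that depends only on the per-peer state
makes the outcome deterministic for a given history of bad updates, which is
the property the quoted message asks for.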