[138812] in North American Network Operators' Group
Re: bfd-like mechanism for LANPHY connections between providers
daemon@ATHENA.MIT.EDU (Sudeep Khuraijam)
Thu Mar 17 01:33:45 2011
From: Sudeep Khuraijam <skhuraijam@liveops.com>
To: Jeff Wheeler <jsw@inconcepts.biz>
Date: Wed, 16 Mar 2011 22:33:39 -0700
In-Reply-To: <AANLkTi=jUVP+s+qA_6oO7HRQWjC5q1J1mCtNYkccoFMN@mail.gmail.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
On Mar 16, 2011, at 6:05 PM, Jeff Wheeler wrote:
>>There is a difference of several orders of magnitude between BFD keepalive
>>intervals (in ms) and BGP (in seconds) with generally configurable
>>multipliers vs. hold timer.
>>With real-time media and ever faster last miles, BGP hold timer may find
>>itself inadequate, if not inappropriate in some cases.
>For eBGP peerings, your router must re-converge to a good state in < 9
>seconds to see an order of magnitude improvement in time-to-repair.
>This is typically not the case for transit/customer sessions."
Not so, if your goal is peer deactivation and failover. Also, you miss the
point. Once the event is detected, the rest of the process starts. I am
talking about event detection. One may want a hold timer longer than 30
seconds but the peer state deactivated instantly on link failure. If that's
the design goal AND link state is not passed through, then BFD-driven BGP
peer deactivation is a good choice.
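To make the detection-time comparison concrete, here is a minimal Python sketch. It assumes the RFC 5880 asynchronous-mode rule that BFD detection time is the negotiated interval times the detect multiplier, and the RFC 4271 rule that a BGP session is declared down when the hold timer expires. The specific values (300 ms x 3, a 90 second hold time) and the function names are illustrative assumptions, not figures from this thread.

```python
def bfd_detection_time_ms(interval_ms: int, detect_mult: int) -> int:
    """Worst-case BFD detection time in async mode: interval x multiplier."""
    return interval_ms * detect_mult


def bgp_detection_time_ms(hold_time_s: int) -> int:
    """Worst-case BGP detection time when no link-state signal arrives:
    the session only drops once the hold timer expires."""
    return hold_time_s * 1000


# Assumed example values, chosen only for illustration.
bfd_ms = bfd_detection_time_ms(interval_ms=300, detect_mult=3)
bgp_ms = bgp_detection_time_ms(hold_time_s=90)
ratio = bgp_ms / bfd_ms

print(f"BFD detects in {bfd_ms} ms, BGP hold timer in {bgp_ms} ms "
      f"({ratio:.0f}x slower)")
```

This covers event detection only. What the router does after the event is detected is a separate cost and is the same in both cases.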
>To make a risk/reward choice that is actually based in reality, you
>need to understand your total time to re-converge to a good state, and
>how much of that is BGP hold-time. You should then consider whether
>changing BGP timers (with its own set of disadvantages) is more or
>less practical than using BFD.
Yes, I see that, and I said "in some cases", not all or most cases.
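The risk/reward arithmetic being argued here can be sketched as follows. Total time-to-repair is modelled as detection time plus re-convergence time, which is a deliberate simplification. The 90 second hold time, the 0.9 second BFD detection time, and the convergence figures are assumed values for illustration only, and the function names are hypothetical.

```python
def time_to_repair_s(detection_s: float, convergence_s: float) -> float:
    """Simplified model: repair time = time to detect + time to re-converge."""
    return detection_s + convergence_s


def improvement_factor(hold_s: float, bfd_s: float, convergence_s: float) -> float:
    """How many times faster the repair is when BFD replaces the hold timer."""
    return (time_to_repair_s(hold_s, convergence_s)
            / time_to_repair_s(bfd_s, convergence_s))


# Assumed inputs: 90 s BGP hold time, 0.9 s BFD detection time.
for convergence_s in (1.0, 9.0, 60.0):
    factor = improvement_factor(90.0, 0.9, convergence_s)
    print(f"convergence {convergence_s:5.1f} s -> {factor:.1f}x faster repair")
```

Under these assumptions the end-to-end gain falls below an order of magnitude once re-convergence takes longer than about 9 seconds, while the detection step itself is two orders of magnitude faster in every row. Which of the two numbers matters depends on the design goal.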
>Let's put it another way: if CPU/FIB convergence time were not a
>significant issue, do you think vendors would be working to optimize
This is orthogonal to my point. The table-size tax, the best-path
algorithms, and the speed with which you can re-FIB and rewrite the ASICs
are constant in both cases. But that's post-event.
>this process, that we would have concepts like MPLS FRR and PIC, and
Those are out of scope in the context of this thread and have completely
different roles.
>that each new router product line upgrade comes with a yet-faster CPU?
For things they can sell more licenses for, such as 3DES, keying
algorithms, virtual instances, other things on BGP, stuff that allows
service providers to charge a lot more money while running on common
infrastructure such as MPLS & FRR, and a zillion other things like stateful
redundancy, higher housekeeping needs, in-service upgrades, and anything
else with a list price. And it's cheaper than the old CPU.
>Of course not. Vendors would just have said, "hey, let's get
>together on a lower hold time for BGP."
Because it would be horrible code design. Link detection is a common
service. Besides, BGP process threads can run longer than the minimum
intervals for link detection. Vendors would have to write checkpoints
within the BGP code to come up and service the link state machine. And
wait, it's a user-configurable checkpoint!! So came BFD: write a simple
state machine and make it available to all protocols.
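The "simple state machine available to all protocols" idea can be sketched roughly like this. It is a toy two-state detector, not the RFC 5880 state machine (which also has AdminDown and Init states and negotiates timers), and the class name, method names, and timer values are all hypothetical. The point it illustrates is only that one detector can notify several protocol clients without any of them polling for link state inside their own code.

```python
from typing import Callable, List


class LivenessSession:
    """Toy liveness detector shared by several protocol clients."""

    def __init__(self, interval_ms: int, detect_mult: int) -> None:
        self.detect_time_ms = interval_ms * detect_mult
        self.last_rx_ms = 0
        self.up = False
        self._clients: List[Callable[[str], None]] = []

    def register(self, on_down: Callable[[str], None]) -> None:
        """A protocol (BGP, OSPF, ...) subscribes to session-down events."""
        self._clients.append(on_down)

    def packet_received(self, now_ms: int) -> None:
        """Record a control packet from the neighbor and mark the session up."""
        self.last_rx_ms = now_ms
        self.up = True

    def poll(self, now_ms: int) -> None:
        """Declare the session down once the detect time passes with no packet."""
        if self.up and now_ms - self.last_rx_ms >= self.detect_time_ms:
            self.up = False
            for notify in self._clients:
                notify("detect timer expired")


events = []
session = LivenessSession(interval_ms=300, detect_mult=3)
session.register(lambda reason: events.append(("bgp", reason)))
session.register(lambda reason: events.append(("ospf", reason)))

session.packet_received(now_ms=0)
session.poll(now_ms=600)  # inside the 900 ms detect time: no notification
session.poll(now_ms=900)  # detect time reached: every client is notified
print(events)
```

Each client decides for itself what to do with the notification, for example tearing down a peer immediately while keeping a long hold timer.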
>As I stated, I'll change my opinion of BFD when implementations
>improve. I understand the risk/reward situation. You don't seem to
>get this, and as a result, your overly-simplistic view is that "BGP
>takes seconds" and "BFD takes milliseconds."
I have no doubt that you understand your risk/reward, but you don't for
every other environment.
For event detection leading to a state change leading to peer deactivation,
"my overly-simplistic view" is the fact (not as you put it, but as it was
written, unedited). How you want to act in response depends on the design.
>is that "BGP
>takes seconds" and "BFD takes milliseconds."
That's what you read, not what I wrote. I was comparing the speed of event
detection.
Now, like I said, for speed of deactivation "BGP hold timer may find itself
inadequate, if not inappropriate in some cases" in this same context. But
as I mentioned, we don't know the pain we are trying to solve or the
requirements that drove this thread in the first place. So I simply put
forward the facts and a business driver.
BFD is no different than deactivating a peer based on link failure. Your
view is that there is no case for it. My point is: it arrived yesterday;
it's just a damn hard thing to monetize upstream in transit.
>>For a provider to require a vendor instead of RFC compliance is sinful.
>Many sins are more practical than the alternatives.
A few, maybe.
--
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator / Innovative Network Concepts