[179342] in North American Network Operators' Group
Re: BGP offloading (fixing legacy router BGP scalability issues)
daemon@ATHENA.MIT.EDU (Łukasz Bromirski)
Thu Apr 9 08:57:57 2015
X-Original-To: nanog@nanog.org
From: Łukasz Bromirski <lukasz@bromirski.net>
In-Reply-To: <CAKCUjRWdWoMv2YKt5z6Qb3R+noM2AFVnEPhuOfdZ6pFRPQi70Q@mail.gmail.com>
Date: Thu, 9 Apr 2015 14:56:36 +0200
To: frederik@kriewitz.eu
Cc: nanog@nanog.org
Errors-To: nanog-bounces@nanog.org
Hi Frederik,
> On 09 Apr 2015, at 13:24, Frederik Kriewitz <frederik@kriewitz.eu> wrote:
>
> Thank you very much for all your responses.
>
> First of all, the problems we see are really RIB (Processor memory)
> and CPU related.
> The TCAM/FIB limits are properly configured. From the FIB capacity
> view they should last a couple of more years. Software routing doesn't
> cause the problem.
> The most extreme case of Cisco 6500/SUP720 abuse I'm aware of is a
> setup with 4 full table transit connections + 2 RR sessions + ~20
> peerings, no downstreams. Besides the IPv4 and IPv6 peerings it's
> pretty much only handling a small amount of OSPF and MPLS (<5k
> prefixes ~500 routers). No netflow or any other memory hog. Under
> normal condition it's running at 20% CPU and 90% processor memory
> (1G/SUP720 XL).
The main limit here, apart from the rather slow RP CPU, is the amount
of memory you can have. I’d set up a CSR1000v as an RR and offload the
control plane from the 6500 completely. It’s a nice box for very fast
hardware forwarding as long as the FIB fits in the TCAMs, which it
seems it does in your scenario.
> In case a session with a lot of prefixes (e.g. a transit) fails, it
> takes up to 5 minutes for the BGP Router process to recompute the RIB,
> etc.. During that time it's running at 100% CPU. Low priority
> processes are completely ignored (e.g. SNMP based monitoring stops
> working). Occasionally it even drops OSPF neighbours or other BGP
> sessions due to expired hold timers causing further havoc.
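You can tune this with process time tweaks, e.g. lowering
`process-max-time` so that the BGP Router process yields the CPU to
other processes sooner.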
> Applying a /22 filter was suggested. In order to actually save the RIB
> memory we would have to disable soft-reconfiguration on the
> corresponding sessions.
> I don't like that option for various reasons as it trades less memory
> usage for longer convergence times and a significantly bigger impact on
> route map updates.
> Due to the IPv4 exhaustion we expect to see more small prefixes in the
> future which can't be aggregated (considering the AS path). Simply
> dropping them would result in less optimal routing.
If you have to filter somewhere on something, I’d rather try to filter
by AS_PATH (neighbors, etc.) than by prefix length.
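
To illustrate the difference between the two approaches, here is a
small Python sketch. It is only one possible reading of "filter by
AS_PATH": keep routes that are at most a couple of AS hops away from
the neighbor, and assume something like a default route covers the
rest. The Route type, the helper names, the max_len and max_hops
thresholds and the sample prefixes and ASNs (documentation and private
ranges) are all made up for illustration. On a real router this would
be an AS-path filter on the session, not Python.

```python
import ipaddress
from typing import NamedTuple, Tuple


class Route(NamedTuple):
    """Hypothetical, simplified view of a received BGP route."""
    prefix: str
    as_path: Tuple[int, ...]  # leftmost AS is the neighbor it was learned from


def distinct_hops(as_path):
    """Count ASes in the path, collapsing consecutive prepends."""
    hops = 0
    previous = None
    for asn in as_path:
        if asn != previous:
            hops += 1
            previous = asn
    return hops


def keep_by_prefix_length(route, max_len=22):
    """Prefix-length filter: drop everything more specific than /max_len."""
    return ipaddress.ip_network(route.prefix).prefixlen <= max_len


def keep_by_as_path(route, max_hops=2):
    """AS_PATH filter (assumed policy): keep routes close to the neighbor."""
    return distinct_hops(route.as_path) <= max_hops


# Illustrative sample data only (documentation/private prefixes and ASNs).
ROUTES = [
    Route("192.0.2.0/24", (64500, 64501)),                # neighbor's customer
    Route("10.16.0.0/16", (64500, 64502, 64503, 64504)),  # far away, but short prefix
    Route("198.51.100.0/24", (64500, 64500, 64500)),      # neighbor itself, prepended
]

by_length = [r.prefix for r in ROUTES if keep_by_prefix_length(r)]
by_path = [r.prefix for r in ROUTES if keep_by_as_path(r)]

print("kept by prefix length:", by_length)
print("kept by AS_PATH:", by_path)
```

With these sample routes the prefix-length filter keeps only the /16,
while the AS_PATH filter keeps both nearby /24s and drops the distant
/16. That is the trade-off: small prefixes that cannot be aggregated
survive as long as they are topologically close.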
-- 
"There's no sense in being precise when |               Łukasz Bromirski
 you don't know what you're talking     |      jid:lbromirski@jabber.org
 about."               John von Neumann |    http://lukasz.bromirski.net