[50969] in North American Network Operators' Group

Re: Max Prefixes Configured on Customer BGP

daemon@ATHENA.MIT.EDU (Chris Woodfield)
Fri Aug 16 13:04:46 2002

Date: Fri, 16 Aug 2002 12:58:41 -0400
From: Chris Woodfield <rekoil@semihuman.com>
To: Richard A Steenbergen <ras@e-gerbil.net>
Cc: Jared Mauch <jared@puck.Nether.net>, nanog@merit.edu
In-Reply-To: <20020816034117.GR53265@overlord.e-gerbil.net>
Errors-To: owner-nanog-outgoing@merit.edu

That's why you make sure that any incident where max-prefix is tripped is
caught by a syslog watcher and brought to the immediate attention of whoever's
sitting in your NOC. Honestly, if all you're dealing with is customer BGP
sessions, I would propose that 90% of them don't advertise more than 10
prefixes, so a max-prefix limit of, say, 100 should do for most cases. And for
that last 10%, max-prefix is a per-session configuration, so that number can
always be set higher. IMO, advertising 100 routes for 30 seconds is far less
damaging than 8000 routes.
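
For illustration only, a syslog watcher along these lines could be sketched in
Python. The message tags (%BGP-4-MAXPFX for the warning, %BGP-3-MAXPFXEXCEED
for the trip) and the "from <peer>" wording are assumptions modeled on Cisco
IOS log lines, and scan() is a hypothetical helper name; check the patterns
against what your own routers actually log.

```python
import re

# Assumed message shapes, modeled on Cisco IOS BGP log lines. Verify them
# against real router output before relying on these patterns.
MAXPFX = re.compile(
    r"%BGP-(?P<sev>\d)-(?P<tag>MAXPFXEXCEED|MAXPFX):.*?from (?P<peer>\S+)"
)


def scan(lines):
    """Yield (level, peer, line) for every max-prefix event found in lines."""
    for line in lines:
        m = MAXPFX.search(line)
        if m is None:
            continue  # not a max-prefix message
        # A tripped limit is treated as critical, a threshold warning as a warning.
        level = "CRITICAL" if m.group("tag") == "MAXPFXEXCEED" else "WARNING"
        yield level, m.group("peer"), line.rstrip("\n")
```

Feeding scan() the lines of a live log (any iterable of strings works) and
paging the NOC on every CRITICAL result would cover the "immediate attention"
part; how the alert is delivered is left out of this sketch.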

Also, don't forget about the warn option - if a customer's organic growth puts
them close to the prefix limit, you should get a heads-up in most cases.
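
As a rough sketch of how a limit and a warning threshold interact, the Python
below classifies a session by its received prefix count. The function name
classify(), the 75% default threshold, and the exact boundary behavior (warn
at or above the threshold, trip only when the count exceeds the limit) are
assumptions for illustration, not a statement of how any particular router
implements it.

```python
def classify(count, limit, warn_pct=75):
    """Return 'ok', 'warn', or 'exceeded' for a session's received prefix count.

    warn_pct is the warning threshold as a percentage of limit (assumed
    default of 75 for this sketch).
    """
    if count > limit:
        return "exceeded"  # session would be torn down
    # Integer arithmetic avoids float rounding right at the boundary.
    if count * 100 >= limit * warn_pct:
        return "warn"  # heads-up: customer is approaching the limit
    return "ok"
```

With a limit of 100, a customer growing from 10 prefixes would move to "warn"
at 75 and only reach "exceeded" at 101, which is the window in which the limit
can be raised without an outage.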

I recall an incident where we brought up a customer advertising around 600
routes, and sent the prefix list to our upstream, who dutifully added all
600 routes to the prefix list, but neglected to raise their maximum-prefix
limit from 300. This, of course, had predictable results. Doh.
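
A mismatch like that is cheap to catch before it bites. The Python sketch
below flags any session whose prefix filter permits more routes than its
max-prefix limit allows. The session dictionaries, their keys, and the name
limit_problems() are hypothetical; a real check would read both values from
your configuration source.

```python
def limit_problems(sessions):
    """List sessions whose prefix filter is larger than their max-prefix limit.

    Each session is a dict with hypothetical keys 'name', 'prefix_list'
    (the permitted prefixes) and 'max_prefix' (the configured limit).
    Returns (name, filter_size, max_prefix) tuples for the offenders.
    """
    return [
        (s["name"], len(s["prefix_list"]), s["max_prefix"])
        for s in sessions
        if len(s["prefix_list"]) > s["max_prefix"]
    ]
```

Run after every prefix-list update, a check of this kind would have reported
the 600-entry filter sitting behind a limit of 300.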

-C

> This isn't a terribly cisco-specific reply so I'll keep it here.
>
> The problem with restart systems (btw thank you cisco for finally adding
> this)  is, think about how much damage can be done by announcing 8k routes
> for the 30 seconds (or 5-10 minutes if there is a Foundry in the mix :P)
> before you get to the limit and kill the session. Now add in the damage
> caused by this happening every 15 minutes, and the dampening. Or even
> worse, someone who turns up more routes and happens to hit right around
> the exact number or close to it. Imagine a session which goes over by 1
> route, trips, stays down for 15 minutes, comes back up and this time has 1
> less route, and no one notices the prefix limit needs to be raised. You
> should make sure that the restart time exceeds the number/length of flaps
> necessary to trigger dampening, which on a connect you transit is pretty
> darn hard to accurately guess.
>
> IMHO, using only prefix limits on a customer is actually doing them (and
> the rest of the internet that listens to your announcements) a disservice.
>
> A better system might be where the session is kept up (or periodically
> polled, if you want to make it obvious to the other party that there is a
> problem) without installing the routes, and kept in a "quarantine" state
> for X amount of time to make sure that things stay below a configured
> number. This would be at least a slightly better way of recovering quickly
> once the "problem" has passed, without mucking things up every 15 minutes
> in the process.
>
> --
> Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
> PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)
