[139373] in North American Network Operators' Group


Re: LAGing backbone links

daemon@ATHENA.MIT.EDU (Shane Amante)
Tue Apr 5 11:31:55 2011

From: Shane Amante <shane@castlepoint.net>
In-Reply-To: <BANLkTi=zXMrxG7m7mNiVtaHRnS6bFVL1EQ@mail.gmail.com>
Date: Tue, 5 Apr 2011 09:30:47 -0600
To: Payam Chychi <unclepieman@gmail.com>
Cc: nanog@nanog.org
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

Payam,

On Apr 4, 2011, at 18:17 MDT, Payam Chychi wrote:
> Hello All,
>
> I was wondering if anyone had any thoughts as to the best practices of
> running multiple backbone links between 2 routers.  In the past we've added
> additional links as needed, then simply enabled IS-IS when they were good to
> go.  I'd then let IS-IS handle load balancing the traffic over the two
> links.  But I know that others out there would setup a LAG once they had
> more than one link between two routers.  Is there a best practice?  Does it
> matter?  Any implications to a MPLS setup?

In general, if you're using relatively modern, medium- to higher-end
equipment, it should "just work".  Some things to watch out for, in
order of importance:
1)  Be mindful of the number of component-links you can put into a
single LAG.  This varies by platform.  In general, higher-end
routers/switches support at least 16 component-links in a single LAG.
In the last couple of years, several vendors have started shipping
equipment and/or software that takes this up to 64 component-links in
a single LAG.  (Depending on the platform, LAGs may let you build larger
virtual links between adjacent devices than ECMP, which may be limited
to 8 paths in a single ECMP group ... but, again, that all depends on
the platform type.)
2)  The distribution of flows across the component-links in a single LAG
can vary dramatically depending on the type of traffic you're pushing.
Specifically, for /Internet/ (IPv4 or IPv6) over MPLS traffic, you will
most likely get very good load distribution given the pseudo-randomness
of IP addresses and Layer-4 port information (in particular, source
ports from a client toward a server).  OTOH, if you have traffic in
[very large] PWs, then typically LSRs/switches/routers can't look past
the MPLS labels and inner Layer-2 encapsulation to find granular input
keys for the load-balancing hash.  Thus, the load-balancing hash will
place all traffic for a single PW VC on a single component-link in the
LAG, with no control over which one.  The good news is that there is
hope on the horizon in the form of:
http://tools.ietf.org/html/draft-ietf-pwe3-fat-pw-05
... which, in short, expects the ingress PE to [try to] find granular
input keys in the incoming traffic (e.g.: input keys from an IP header
contained within an Ethernet frame that will be transported as a PW VC
over your MPLS core), and create a hash of those keys that gets placed
into a "FAT PW" label that sits below the PW VC label.  The idea is that
core LSRs would still load-balance based on the MPLS label stack, from
the bottom-most to the top-most label, which should result in more even
load distribution of PW VC flows over component-links in a LAG.  This
feature is just starting to appear in one vendor's equipment and will
hopefully show up in others soon, as well.  (Please bug your vendors for
this!  ;-)
3)  Depending on the vendor, you may specifically have to configure the
device to do load-balancing over LAGs or ECMP paths (e.g.: Juniper &
Brocade, possibly others).  Generally, you have to tell the device which
input keys to look for and/or how many MPLS labels to look past to find
those input keys, e.g.: in Juniper you configure forwarding-options
-> hash-key -> family mpls -> label-1, label-2, payload -> ip, etc.
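
To make point 2 concrete, here is a rough simulation sketch in Python.
It is not any vendor's actual hash: the CRC32-modulo link selection, the
4-link LAG, the label values and the addresses are all assumptions made
up for illustration.  It compares three cases: IP flows whose headers
the LSR can see, a single PW where only the labels are visible, and the
same PW with an extra per-flow label added by the ingress PE in the
spirit of the FAT PW draft.

```python
import random
import zlib
from collections import Counter

NUM_LINKS = 4  # hypothetical LAG with 4 component-links


def pick_link(keys, num_links=NUM_LINKS):
    """Map a tuple of hash input keys to a component-link index.

    CRC32 modulo the link count is a stand-in; real platforms use their
    own (usually undocumented) hash functions.
    """
    return zlib.crc32(repr(keys).encode()) % num_links


def random_ip_flow(rng):
    """A made-up 5-tuple: many clients, ephemeral source ports, one server."""
    src = f"10.{rng.randrange(256)}.{rng.randrange(256)}.{rng.randrange(1, 255)}"
    return (src, "192.0.2.10", 6, rng.randrange(1024, 65536), 443)


def flow_label(flow):
    """Hypothetical ingress-PE hash of the inner flow, squeezed into 20 bits."""
    return zlib.crc32(repr(flow).encode()) & 0xFFFFF


def simulate(num_flows=10000, seed=1):
    rng = random.Random(seed)
    flows = [random_ip_flow(rng) for _ in range(num_flows)]
    tunnel_label, pw_label = 100, 200  # made-up label values

    # Case 1: IP over MPLS, the LSR can see the IP header under the label.
    ip_visible = Counter(pick_link((tunnel_label,) + f) for f in flows)

    # Case 2: plain PW, only the labels are visible, identical for every packet.
    pw_only = Counter(pick_link((tunnel_label, pw_label)) for _ in flows)

    # Case 3: PW plus a per-flow label below the PW label.
    fat_pw = Counter(
        pick_link((tunnel_label, pw_label, flow_label(f))) for f in flows
    )
    return ip_visible, pw_only, fat_pw


if __name__ == "__main__":
    for name, counts in zip(("ip_visible", "pw_only", "fat_pw"), simulate()):
        print(name, dict(sorted(counts.items())))
```

In this toy model the plain PW case puts every packet on one link, while
the other two cases spread flows over all links.  How even the spread is
on real gear depends on the platform's hash and your traffic mix.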

Some other things to look out for:
4)  Some vendors may use different hash algorithms for LAG vs. ECMP, so
you may get "better" load-balancing from one compared to the other.  Ask
your vendor for details, as this may not be obvious from lab testing.
5)  Some vendors may have a limit on the maximum number of MPLS labels
that they can look past to find, say, an IP payload that can be used as
input keys for the load-balancing hash.  This used to be a concern
several years ago, but in general most medium- to high-end equipment can
look past /at least/ 3 MPLS labels, which should cover you in the more
common cases where either:
   a)  You have IP/LDP/RSVP/RSVP-FRR, where the outermost label is an
RSVP Bypass Label when you're [briefly] running on a Bypass; or,
   b)  You have VPN-label/LDP/RSVP, where you're moving IPVPN or 6PE,
etc. traffic and using LDP over RSVP tunneling.
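
As a rough illustration of point 5, the Python sketch below walks an
MPLS label stack (4-byte entries: 20-bit label, 3-bit TC, 1-bit
bottom-of-stack flag, 8-bit TTL) and checks whether a device that can
only look past max_labels labels would reach the IP payload.  The helper
names, the label values and the first-nibble IPv4/IPv6 check are
assumptions for illustration, not a description of any particular
platform.

```python
def encode_entry(label, bottom, tc=0, ttl=64):
    """Build one 4-byte MPLS label stack entry."""
    value = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return value.to_bytes(4, "big")


def parse_label_stack(data):
    """Return (labels, payload), reading entries until the bottom-of-stack bit."""
    labels = []
    offset = 0
    while offset + 4 <= len(data):
        entry = int.from_bytes(data[offset:offset + 4], "big")
        labels.append(entry >> 12)
        offset += 4
        if (entry >> 8) & 1:  # bottom-of-stack flag set
            return labels, data[offset:]
    raise ValueError("no bottom-of-stack entry found")


def can_reach_ip_payload(data, max_labels=3):
    """True if the stack is shallow enough and the payload looks like IP."""
    labels, payload = parse_label_stack(data)
    if len(labels) > max_labels or not payload:
        return False
    # Guess IPv4 or IPv6 from the version nibble (an assumed heuristic).
    return (payload[0] >> 4) in (4, 6)


if __name__ == "__main__":
    # Case (a): IP under LDP, RSVP and an RSVP-FRR bypass label (made-up values).
    packet = (
        encode_entry(300, bottom=False)    # bypass label (outermost)
        + encode_entry(200, bottom=False)  # RSVP label
        + encode_entry(100, bottom=True)   # LDP label
        + b"\x45" + bytes(19)              # minimal fake IPv4 header
    )
    print(can_reach_ip_payload(packet, max_labels=3))
    print(can_reach_ip_payload(packet, max_labels=2))
```

With three labels on the packet, a device that can look past three
labels finds the IP header, while one limited to two does not and has to
hash on the labels alone.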

Anyway, HTH,

-shane

