[115317] in North American Network Operators' Group


RE: Unicast Flooding

daemon@ATHENA.MIT.EDU (Holmes,David A)
Wed Jun 17 19:26:59 2009

Date: Wed, 17 Jun 2009 16:26:21 -0700
In-Reply-To: <483E6B0272B0284BA86D7596C40D29F9C381C142F5@PUR-EXCH07.ox.com>
From: "Holmes,David A" <dholmes@mwdh2o.com>
To: "Matthew Huff" <mhuff@ox.com>, "Brian Shope" <blackwolf99999@gmail.com>,
	<nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

In a layer 3 switch, I consider unicast flooding caused by an L2 CAM table
timeout to be a design defect. To test vendors' L3 switches for this defect,
we have used a traffic generator to send 50-100 Mbps of pings to a device
that does not reply to them, with the L3 switch routing the pings from one
VLAN to another. In defective devices the L2 CAM table entry expires,
causing the 50-100 Mbps unicast stream to be flooded out all ports in the
destination VLAN. In my view the L3 and L2 forwarding state machines must
be synchronized such that L3 forwarding continues as long as packets are
entering the L3 switch on one VLAN and exiting the switch on another VLAN
via routing. Gratuitous ARPs seem to be a workaround that resets the CAM
entry timeout interval, but they are not an elegant solution.
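
To make the failure mode above concrete, here is a minimal Python sketch of
a toy CAM table with aging. It is an illustration only, not any vendor's
implementation: the class name, the MAC address, the port number, and the
one-packet-per-second rate are all assumptions. The L3 side is assumed to
keep a valid ARP entry throughout, so every packet is rewritten with the
destination MAC and only the L2 lookup can miss.

```python
from dataclasses import dataclass, field


@dataclass
class CamTable:
    """Toy L2 table: MAC -> (port, time the MAC was last seen as a source)."""
    aging_time: int
    entries: dict = field(default_factory=dict)

    def learn(self, mac, port, now):
        # An entry is refreshed only when the MAC appears as a source address.
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        # Return the egress port, or None if the entry is absent or aged out.
        entry = self.entries.get(mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen >= self.aging_time:
            del self.entries[mac]
            return None
        return port


def count_flooded(aging_time, duration):
    """Route one packet per second to a silent host; count L2 misses (floods)."""
    cam = CamTable(aging_time)
    silent_mac = "00:11:22:33:44:55"        # hypothetical destination
    cam.learn(silent_mac, port=7, now=0)    # learned once, e.g. from an ARP reply
    flooded = 0
    for now in range(duration):
        # The host never sends, so nothing ever refreshes the CAM entry.
        if cam.lookup(silent_mac, now) is None:
            flooded += 1
    return flooded
```

With a 300 second aging time, count_flooded(300, 600) reports that every
packet after the first 300 seconds is flooded. A switch that kept the L2
entry alive while L3 forwarding is active would report zero.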

-----Original Message-----
From: Matthew Huff [mailto:mhuff@ox.com]
Sent: Wednesday, June 17, 2009 2:58 PM
To: 'Brian Shope'; 'nanog@nanog.org'
Subject: RE: Unicast Flooding

Unicast flooding is a common occurrence in large datacenters, especially
with asymmetrical paths caused by different first-hop routers (via HSRP,
VRRP, etc.). We ran into this some time ago. Most ARP-sensitive systems,
such as clusters, HSRP, and content switches, are smart enough to send
out gratuitous ARPs, which eliminates the worry about increasing the
timeouts. We haven't had any issues since we made the changes.

After debugging the problem we added "mac-address-table aging-time 14400"
to our data center switches. That syncs the MAC aging time to the same
timeout value as the ARP timeout.
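
As a rough back-of-the-envelope check on why matching the two timers helps,
the Python sketch below estimates how long a mostly silent host would be
flooded per ARP cycle. It assumes a simplified model that is not from the
thread: the host's MAC is relearned only when it answers the router's ARP
refresh, once per ARP timeout, and nothing else refreshes the CAM entry.
The function name is hypothetical, and the 300 second MAC aging value is
used as a commonly cited default, so check your own platform.

```python
def flood_seconds_per_arp_cycle(mac_aging, arp_timeout):
    """Seconds per ARP cycle during which the CAM entry is expired.

    Simplified model (an assumption): the MAC is relearned at each ARP
    refresh and never in between.
    """
    return max(0, arp_timeout - mac_aging)


# Mismatched timers: CAM ages out long before the next ARP refresh.
mismatched = flood_seconds_per_arp_cycle(mac_aging=300, arp_timeout=14400)

# Matched timers, as in "mac-address-table aging-time 14400".
matched = flood_seconds_per_arp_cycle(mac_aging=14400, arp_timeout=14400)

print(mismatched, matched)
```

Under this model the flood window disappears once the MAC aging time is at
least as long as the ARP timeout. Real hosts that transmit regularly will
refresh the CAM entry far more often than the model assumes.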

----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
http://www.ox.com  | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139


> -----Original Message-----
> From: Brian Shope [mailto:blackwolf99999@gmail.com]
> Sent: Wednesday, June 17, 2009 5:33 PM
> To: nanog@nanog.org
> Subject: Unicast Flooding
>
> Recently while running a packet capture I came across some unicast
> flooding that was happening on my network.  One of our core switches
> didn't have the mac-address for a server, and was flooding all packets
> destined to that server.  It wasn't learning the mac-address because the
> server was responding to packets out on a different network card on a
> different switch.  The flooding I was seeing wasn't enough to cause any
> network issues, it was only a few megs, but it was something that I
> wanted to fix.
>
> I've run into this issue before, and solved it by statically entering
> the mac-address into the CAM tables.
>
> I want to avoid this problem in the future, and I'm looking at two
> different things.
>
> The first is preventing it in the first place.  Along those lines, I've
> seen some recommendations online about changing the ARP and CAM timeouts
> to be the same.  However, there seems to be disagreement on which is
> better: making the ARP timeouts match the CAM table timeouts, or vice
> versa.  Also, when talking about this, everyone seems to consider only
> routers, but what about the timers on a firewall?  I'm worried that I
> might cause other issues by changing these timers.
>
> The second thing I'm considering is monitoring.  I'd like to set up
> something to monitor for any excessive unicast flooding in the future.
> I understand that a little unicast flooding is normal, as the switch has
> to do a little bit of flooding to find out where people are.  While
> looking for a way to monitor this, I came across the
> 'mac-address-table unicast-flood' command on Cisco switches.  This
> looked perfect for what I needed, but apparently it is currently not an
> option on 6500 switches with Sup720s.  Since there doesn't appear to be
> an option on Cisco that monitors specifically for unicast floods, I
> thought that maybe I could set up a server with a network card in
> promiscuous mode and then keep stats of all packets received that aren't
> destined for the server and that also aren't legitimate broadcasts or
> multicasts.  The only problem with that is that I don't want to have to
> completely custom build my own solution.  I was hoping that someone may
> have already created something like this, or that maybe there is a good
> reporting tool for Wireshark or something that could generate the report
> that I want.
>
> Anyone have any suggestions on either prevention/monitoring?
>
> Thanks!!
>
> -Brian
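
The promiscuous-mode idea in the quoted message comes down to a small
classification rule, sketched below in Python. This is not an existing
tool. The function names are hypothetical, and the input is assumed to be
(destination MAC, frame length) pairs already extracted from a capture by
whatever means is available. It also assumes the capture host sits on an
ordinary access port rather than a mirror port, where seeing other hosts'
unicast traffic would be expected.

```python
from collections import Counter


def is_group_address(mac):
    """True for broadcast and multicast MACs (I/G bit set in the first octet)."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x01)


def is_unicast_flood(dst_mac, local_macs):
    """True if the frame is unicast and not addressed to this host."""
    if is_group_address(dst_mac):
        return False  # legitimate broadcast or multicast
    return dst_mac.lower() not in {m.lower() for m in local_macs}


def tally_flooded(frames, local_macs):
    """Sum flooded bytes per destination MAC.

    frames: iterable of (dst_mac, length_in_bytes) pairs (assumed input format).
    """
    totals = Counter()
    for dst_mac, length in frames:
        if is_unicast_flood(dst_mac, local_macs):
            totals[dst_mac.lower()] += length
    return totals
```

Feeding the per-destination totals into a threshold check per polling
interval would separate the small amount of normal flooding from a
sustained stream toward a single MAC.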


