[115313] in North American Network Operators' Group
RE: Unicast Flooding
daemon@ATHENA.MIT.EDU (Matthew Huff)
Wed Jun 17 17:59:12 2009
From: Matthew Huff <mhuff@ox.com>
To: 'Brian Shope' <blackwolf99999@gmail.com>, "'nanog@nanog.org'"
<nanog@nanog.org>
Date: Wed, 17 Jun 2009 17:58:23 -0400
In-Reply-To: <295271b90906171432h431f9867td53ded86f2a787f@mail.gmail.com>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
Unicast flooding is a common occurrence in large datacenters, especially with
asymmetrical paths caused by different first-hop routers (via HSRP, VRRP,
etc.). We ran into this some time ago. Most ARP-sensitive systems such as
clusters, HSRP, content switches, etc. are smart enough to send out gratuitous
ARPs, which eliminates the worries of increasing the timeouts. We haven't had
any issues since we made the changes.
After debugging the problem we added "mac-address-table aging-time 14400" to
our data center switches. That syncs the MAC aging time to the same timeout
value as the ARP timeout.
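
To illustrate why matching the two timers helps, here is a rough arithmetic
sketch in Python. The function name is made up for this example, and the
values 14400 s (ARP timeout) and 300 s (MAC aging) are the commonly documented
Cisco IOS defaults, assumed here rather than taken from any particular switch.
It also assumes the worst case, where the switch sees no frames sourced from
the host between ARP refreshes (for example because of an asymmetric return
path).

```python
# Illustrative sketch only; flood_window_seconds is a hypothetical helper.
# Assumed defaults: ARP cache timeout 14400 s, MAC address-table aging 300 s.

def flood_window_seconds(arp_timeout: int, mac_aging: int) -> int:
    """Worst-case seconds per ARP cycle in which the router still holds an
    ARP entry for a host whose MAC has already aged out of the switch table,
    so frames sent to that host are flooded out every port in the VLAN.
    """
    return max(0, arp_timeout - mac_aging)


if __name__ == "__main__":
    # Default timers: the MAC entry expires long before the ARP entry does.
    print(flood_window_seconds(14400, 300))
    # After "mac-address-table aging-time 14400": the gap is closed.
    print(flood_window_seconds(14400, 14400))
```

With the assumed defaults the gap is 14100 seconds per ARP cycle; with both
timers at 14400 it is zero.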
----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
http://www.ox.com  | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139
> -----Original Message-----
> From: Brian Shope [mailto:blackwolf99999@gmail.com]
> Sent: Wednesday, June 17, 2009 5:33 PM
> To: nanog@nanog.org
> Subject: Unicast Flooding
>
> Recently while running a packet capture I came across some unicast flooding
> that was happening on my network. One of our core switches didn't have the
> mac-address for a server, and was flooding all packets destined to that
> server. It wasn't learning the mac-address because the server was
> responding to packets out on a different network card on a different
> switch. The flooding I was seeing wasn't enough to cause any network
> issues, it was only a few megs, but it was something that I wanted to fix.
>
> I've run into this issue before, and solved it by statically entering the
> mac-address into the cam tables.
>
> I want to avoid this problem in the future, and I'm looking at two different
> things.
>
> The first is preventing it in the first place. Along those lines, I've seen
> some recommendations on-line about changing the arp and cam timeouts to be
> the same. However, there seems to be a disagreement on which is better,
> making the arp timeouts match the cam table timeouts, or vice versa. Also,
> when talking about this, everyone seems to be only considering routers, but
> what about the timers on a firewall? I'm worried that I might cause other
> issues by changing these timers.
>
> The second thing I'm considering is monitoring. I'd like to set up something
> to monitor for any excessive unicast flooding in the future. I understand
> that a little unicast flooding is normal, as the switch has to do a little
> bit of flooding to find out where people are. While looking for a way to
> monitor this, I came across the 'mac-address-table unicast-flood' command on
> Cisco switches. This looked perfect for what I needed, but apparently it is
> currently not an option on 6500 switches with Sup720s. Since there doesn't
> appear to be an option on Cisco that monitors specifically for unicast
> floods, I thought that maybe I could set up a server with a network card in
> promiscuous mode and then keep stats of all packets received that aren't
> destined for the server and that also aren't legitimate broadcasts or
> multicasts. The only problem with that is that I don't want to have to
> completely custom build my own solution. I was hoping that someone may have
> already created something like this, or that maybe there is a good reporting
> tool for wireshark or something that could generate the report that I want.
>
> Anyone have any suggestions on either prevention/monitoring?
>
> Thanks!!
>
> -Brian
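
On the monitoring idea in the quoted message (a host in promiscuous mode
counting frames that are neither addressed to it nor broadcast/multicast), the
classification step could look roughly like the Python sketch below. This is
only a sketch under assumptions: the function names are hypothetical, frames
are taken to be raw Ethernet frames as bytes, and the capture itself (opening
the interface in promiscuous mode and reading frames) is deliberately left
out. Broadcast and multicast destinations are both recognized by the group
bit, the least significant bit of the first destination octet.

```python
from collections import Counter
from typing import Iterable, Set


def is_flooded_unicast(frame: bytes, local_macs: Set[bytes]) -> bool:
    """Return True if the frame is unicast but not addressed to this host."""
    if len(frame) < 14:          # shorter than an Ethernet header; ignore
        return False
    dst = frame[0:6]
    if dst[0] & 0x01:            # group bit set: multicast or broadcast
        return False
    return dst not in local_macs


def count_flooded(frames: Iterable[bytes], local_macs: Set[bytes]) -> Counter:
    """Count flooded unicast frames per destination MAC address."""
    counts: Counter = Counter()
    for frame in frames:
        if is_flooded_unicast(frame, local_macs):
            dst = ":".join(f"{b:02x}" for b in frame[0:6])
            counts[dst] += 1
    return counts
```

Feeding this from a capture and alerting when a destination's count exceeds
some threshold would still need to be built around it; what threshold counts
as "excessive" depends on the network.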