[100662] in North American Network Operators' Group
Re: monitoring tools
daemon@ATHENA.MIT.EDU (Bill Nash)
Thu Nov 1 12:22:42 2007
Date: Thu, 1 Nov 2007 09:18:26 -0700 (MST)
From: Bill Nash <billn@billn.net>
To: Bill Fenner <fenner@gmail.com>
cc: "Nesser, Phil" <nesser@amazon.com>, "nanog@nanog.org" <nanog@nanog.org>
In-Reply-To: <ed6d469d0710311418i3904a56br201a07b52afd839a@mail.gmail.com>
Errors-To: owner-nanog@merit.edu
On Wed, 31 Oct 2007, Bill Fenner wrote:
>
> On 10/30/07, Nesser, Phil <nesser@amazon.com> wrote:
> > 2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
>
> I don't see netdisco mentioned in this space very much, but I
> recommend it for the "what is plugged into what" question - both in an
> enterprise environment ("where is this misbehaving MAC address?") and
> a data center ("which port was that server plugged into on the
> switch?").
Anecdotal evidence of the usefulness of such tools:
The environment was a pair of cat6509s running multiple gigabit
etherchannel crossconnects, with lots of gigabit and 100mbit servers on
either side, talking back and forth to each other, or up the stick to the
egress routers. I was building an inventory tool to help me track down
mislabelled or unlabelled ports, to clean up and audit the device
inventory. I noticed one lonely 100 meg port bridging a large number of MAC
addresses that were homed on the *other* 6509. I mentioned it as odd in
passing to the network engineer, and was advised that my tool was probably
broken. I took it under advisement and went on about my business.
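The kind of check that surfaces a port like that can be sketched roughly as
follows. This is a minimal illustration, not the tool from the anecdote: the
function name find_suspect_ports, the dictionary shapes, the switch and port
names, and the threshold of 10 are all assumptions made for the example. It
presumes the MAC forwarding tables have already been collected (for instance
via SNMP) into plain Python structures.

```python
def find_suspect_ports(fdb, home_switch, uplinks, threshold=10):
    """Flag non-uplink ports that learn many MACs homed on another switch.

    fdb:         {switch: {port: set of MACs learned on that port}}
    home_switch: {mac: switch the device is physically attached to}
    uplinks:     set of (switch, port) pairs expected to carry remote MACs
    threshold:   hypothetical cutoff for "a large number" of foreign MACs
    """
    suspects = []
    for switch, ports in fdb.items():
        for port, macs in ports.items():
            if (switch, port) in uplinks:
                continue  # crossconnects are supposed to see remote MACs
            # MACs with an unknown home are treated as local, not foreign.
            foreign = {m for m in macs if home_switch.get(m, switch) != switch}
            if len(foreign) >= threshold:
                suspects.append((switch, port, len(foreign)))
    return sorted(suspects)


# Made-up data: 20 hosts homed on sw-b, one host homed on sw-a.
remote = {f"00:00:5e:00:53:{i:02x}" for i in range(20)}
local = "00:00:5e:00:53:ff"
fdb = {
    "sw-a": {"Po1": set(remote), "Fa3/12": set(remote), "Gi1/1": {local}},
    "sw-b": {"Po1": {local}},
}
home_switch = {mac: "sw-b" for mac in remote}
home_switch[local] = "sw-a"
uplinks = {("sw-a", "Po1"), ("sw-b", "Po1")}

suspects = find_suspect_ports(fdb, home_switch, uplinks)
print(suspects)
```

With this sample data the only port reported is the 100 meg access port on
sw-a, since the etherchannel is listed as an expected uplink.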
A few hours later, I discovered that I could make the sysadmins and
network engineers run around asking each other what was broken by scp'ing
a huge file between two databases on opposite switches. When I stopped my
transfer, they stopped running. Start it again, panic at the disco. Very
refreshing.
I brought up the 100 meg port bridging all those addresses, and lo and
behold, a misconfigured load balancer had somehow suborned the multi-gig
etherchannel crossconnects and was bridging everything in the one big vlan
that all the servers sat in. (That's a different story.)
- billn