[88979] in North American Network Operators' Group
Re: 95th percentile - the sociology study.
daemon@ATHENA.MIT.EDU (Bill Nash)
Mon Feb 27 16:53:47 2006
Date: Mon, 27 Feb 2006 16:45:18 -0500 (EST)
From: Bill Nash <billn@odyssey.billn.net>
To: Jo Rhett <jrhett@svcolo.com>
Cc: nanog@merit.edu
In-Reply-To: <20060227201029.GA79264@svcolo.com>
Errors-To: owner-nanog@merit.edu
On Mon, 27 Feb 2006, Jo Rhett wrote:
> All but three of the people who tried to teach me how to calculate 95th
> percentile were polite and clueful when I reminded them that I wanted the
> math, not a tutorial. A couple of misunderstanding actually wandered off
> into statistical analysis ramblings which was an amusing offset to the
> next set.
>
> Only three of the people insisted on telling me I was doing it wrong
> (never posted how we were doing it in the first place, so this was amusing)
> and tried to clue by four me into their approach.
What? Blanket assumptions of 'You are wrong because you are not $self'?
Crazy talk. That never happens.
> 1 tried to convince me that modern equipment can't handle being sampled
> more often than every 5 minutes.
I hope you disabused him/her of this notion. Generally, it's not modern
equipment that can't handle it, it's usually a (literally, not 'in my
opinion') stupid polling mechanism that isn't designed (or tuned!) to be
friendly or intelligent in its efforts. Many tools do bulk table gets
(friendlier, but still overkill, depending on the platform and port
density) or full table walks (Unga! You give me data now!). Since efficient
bulk delivery methods are not universally available, people usually end up
implementing full walks.
Having a poller that does one poll an hour/day to inventory
administratively active interfaces (and keeping unused interfaces
disabled), and consequently only polling active interfaces for counter
increments, is probably the single most intelligent piece of logic you
could implement in a poller to guarantee the least amount of wasted CPU on
your network hardware, especially since CPU time on an x86 database is
cheaper than router CPU. This is also handy for reconciling ifIndex shifts
where persistence is not available or feasible (storing ifIndex as a
property of ifName, not vice versa.)
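A minimal sketch of that inventory pass, in Python. It assumes the ifName and ifAdminStatus tables have already been fetched into plain dicts keyed by ifIndex (the SNMP fetch itself is left out), and the function names are hypothetical. The only protocol fact it relies on is that ifAdminStatus reports up as 1.

```python
from typing import Dict, List, Tuple

ADMIN_UP = 1  # IF-MIB ifAdminStatus: up(1), down(2), testing(3)


def active_interfaces(if_names: Dict[int, str],
                      admin_status: Dict[int, int]) -> Dict[str, int]:
    """Return {ifName: ifIndex} for administratively up interfaces only.

    Keying on ifName makes ifIndex a property of the name, so a shifted
    index does not orphan the interface's history.
    """
    return {name: idx for idx, name in if_names.items()
            if admin_status.get(idx) == ADMIN_UP}


def reconcile(previous: Dict[str, int],
              current: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """List (ifName, old_index, new_index) for interfaces whose ifIndex moved."""
    return [(name, previous[name], idx)
            for name, idx in current.items()
            if name in previous and previous[name] != idx]


# Hypothetical data: Gi0/2 is admin down, and Gi0/3 moves from index 3 to 4
# between two inventory runs (for example after a reload).
yesterday = active_interfaces({1: "Gi0/1", 2: "Gi0/2", 3: "Gi0/3"},
                              {1: 1, 2: 2, 3: 1})
today = active_interfaces({1: "Gi0/1", 2: "Gi0/2", 4: "Gi0/3"},
                          {1: 1, 2: 2, 4: 1})
shifted = reconcile(yesterday, today)
```

The counter poller then only needs the values of `today`, and `shifted` tells the database which stored indexes to update.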
Also, classifying your interfaces with externally applied data (Peer? Edge?
Vlan logical interface? Customer? Infrastructure?) can help you pare down
how much polling you really need to be doing, and at what interval. I'm a
big advocate of network inventory databases, for this reason. An example
of this would include using a Vlan's aggregated traffic counter instead of
the 50 individual ports that comprise it, if you don't need it for your
billing model (Unless you're taxing the customer for non-routed netbios
chatter across your backplane, which I'm fine with.) Standardized
interface naming or network discovery toolsets support automating this, as
well as encouraging engineers to keep the network tidy and labelled. This
little gem of a practice is usually the core of network management
standards.
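As a rough illustration of letting classification drive polling, the sketch below groups interfaces by interval based on a role taken from an inventory database. The role names, the intervals, and the record layout are all assumptions made up for the example, not recommendations.

```python
from typing import Dict, List, Optional

# Hypothetical policy: seconds between polls per role. None means "do not
# poll", e.g. member ports whose traffic is already covered by a Vlan's
# aggregated counter.
POLL_INTERVAL_SECONDS: Dict[str, Optional[int]] = {
    "customer": 60,
    "peer": 60,
    "vlan": 300,
    "infrastructure": 300,
    "vlan-member": None,
}


def polling_plan(inventory: List[dict]) -> Dict[int, List[str]]:
    """Group interface names by polling interval, skipping unpolled roles."""
    plan: Dict[int, List[str]] = {}
    for iface in inventory:
        interval = POLL_INTERVAL_SECONDS.get(iface["role"])
        if interval is None:  # unknown role or explicitly unpolled
            continue
        plan.setdefault(interval, []).append(iface["name"])
    return plan


# Hypothetical inventory records.
inventory = [
    {"name": "Gi0/1", "role": "customer"},
    {"name": "Vlan50", "role": "vlan"},
    {"name": "Gi0/7", "role": "vlan-member"},
    {"name": "Te1/1", "role": "peer"},
]
plan = polling_plan(inventory)
```

Unknown roles are skipped here rather than polled by default; whether that is the right default depends on how much you trust the inventory.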
Based on the active interface volume and CPU impact incurred by the SNMP
agent on the network device, you should be able to poll platforms on a
fairly constant basis and use sliding intervals in your averaging
processes, as requirements and router impact demand. Taking the time to
benchmark the effects of your polling at different intervals is an
engineering step that will keep your operational impact low as you scale.
As always, your mileage may vary.
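One way to read "sliding intervals in your averaging" is sketched below: turn successive octet counter readings into bit rates and average the most recent few. The class name and window size are hypothetical, and it assumes at most one counter wrap between polls.

```python
from collections import deque
from typing import Deque, Optional, Tuple


def counter_delta(prev: int, curr: int, bits: int = 64) -> int:
    """Difference between two counter readings, allowing for a single wrap."""
    return (curr - prev) % (1 << bits)


class SlidingRate:
    """Average bits/second over the last `window` polling intervals."""

    def __init__(self, window: int) -> None:
        self.rates: Deque[float] = deque(maxlen=window)
        self.last: Optional[Tuple[float, int]] = None

    def add(self, timestamp: float, octets: int, bits: int = 64) -> None:
        """Record one counter reading taken at `timestamp` (seconds)."""
        if self.last is not None:
            elapsed = timestamp - self.last[0]
            if elapsed > 0:
                delta = counter_delta(self.last[1], octets, bits)
                self.rates.append(delta * 8 / elapsed)
        self.last = (timestamp, octets)

    def average(self) -> Optional[float]:
        """Mean of the retained rates, or None before two readings exist."""
        if not self.rates:
            return None
        return sum(self.rates) / len(self.rates)


# Hypothetical readings: 750,000 octets every 60 seconds.
rate = SlidingRate(window=3)
rate.add(0, 0)
rate.add(60, 750_000)
rate.add(120, 1_500_000)
```

Because each rate is divided by the actual elapsed time, the poll spacing can change as router impact demands without skewing the average.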
Note: If you're small enough that you're still using MRTG, you can likely
just ignore everything I just wrote.
> And for those of you who can count, yes, one of the people who tried to
> teach me how to calculate 95th percentile didn't respond back either
> positively or negatively when I tried to remind him of the questions asked,
> and one was rude. (I don't blame him, after the 10th or so reply I was too)
My process for self-filtering posts to nanog:
1. Read post in thread.
2. Is post by a known windbag? If yes, delete and continue.
3. Draft response.
4. Suspend response in outbox for at least half an hour.
5. Ask, 'Does this post add anything constructive/funny to the thread?' If
no, delete and continue.
6. Ask, 'Will this post likely trigger hits on my procmail filter by the
aforementioned windbags?' If yes, flip a coin, because anything posted
with an opinion will likely cause that. (Yes, Martin, it's there, before
you respond to check.)
7. Ask, 'Was I pissed off when I wrote this?' If yes, let sit for another
hour, goto 5.
This entire process used to be prefaced with '0. Subscribe to posting
list.' This was usually enough to deter a post until something annoyed me
sufficiently to get through the whole process. ;)
- billn