[88900] in North American Network Operators' Group


RE: How do you (not how do I) calculate 95th percentile?

daemon@ATHENA.MIT.EDU (Russell, David)
Wed Feb 22 17:42:43 2006

Date: Wed, 22 Feb 2006 17:46:01 -0500
From: "Russell, David" <drussell@thrupoint.net>
To: "Jo Rhett" <jrhett@svcolo.com>, <nanog@merit.edu>
Errors-To: owner-nanog@merit.edu


I think that we have two (partially) unrelated issues in this thread:
1) how often you should sample and 2) what you do with the results.

I personally think that 5-minute sampling is so last century, because it
is better suited to batch loads that do not change very quickly than to
interactive web applications. If your users' web performance is being
affected by a particular link, they are going to notice it in the
10-second range. Congestion events lasting 1-3 minutes can be a problem.
After five minutes they have forgotten what they were doing :)

How often you check the counter should be driven by how granularly you
want to measure the network. Pick the right counter so that it does not
wrap on you during your sampling interval.

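As a rough illustration of the wrap concern, here is a minimal Python
sketch. It assumes an SNMP-style octet counter (for example a 32-bit
ifInOctets or a 64-bit ifHCInOctets) and at most one wrap between two
reads; the function and parameter names are made up for this example.

```python
def counter_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bit rate between two reads of an octet counter.

    Assumes the counter wrapped at most once between the two reads.
    """
    modulus = 1 << counter_bits
    # Modular subtraction yields the correct delta across a single wrap.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


def seconds_to_wrap(line_rate_bps, counter_bits=32):
    """Time for an octet counter to wrap at a sustained line rate."""
    return (1 << counter_bits) * 8 / line_rate_bps
```

At a sustained 1 Gbps a 32-bit octet counter wraps in roughly 34 seconds,
so anything slower than that polling interval needs the 64-bit counter.
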
The initial downside is that you have 10-30 times as much data (sampling
every 10-30 seconds instead of every 5 minutes). Network data has chaotic
(aka self-similar) characteristics that make simple statistics such as
max, min or average somewhat useless.

My understanding of the reason to calculate a 95th percentile is to
reduce the dataset size and to make some sense out of the random
performance data. For example, I could take some range of data, figure
out the 95% threshold and save that as a data point (e.g. 95% of the
samples are less than X Mbps).

Read the counter value, compute the rate for the interval, then compute
the 95th percentile threshold for 20+ samples and save that as the value
for that longer period.

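The reduction step described above could look something like this in
Python. It is only a sketch: it assumes the per-interval rates are already
computed, uses the nearest-rank definition of a percentile (one of several
in common use), and the names and the 20-sample window are illustrative.

```python
import math


def percentile_nearest_rank(samples, pct=95):
    """Smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank method: rank = ceil(pct/100 * N), counted from 1.
    rank = math.ceil(len(ordered) * pct / 100)
    return ordered[max(rank, 1) - 1]


def reduce_by_percentile(rates, window=20, pct=95):
    """Collapse each run of `window` rate samples into one percentile value."""
    return [
        percentile_nearest_rank(rates[i:i + window], pct)
        for i in range(0, len(rates), window)
    ]
```

With 20 samples per window this keeps the 19th-highest value and throws
away the single largest one, so each saved data point stands for 20 raw
samples.
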
The basic assumption is that you can ignore, or not bill, the 5% of the
time that you had higher values. That is 30 minutes during a 10-hour
business window or 72 minutes over a 24-hour period. One could argue that
95 should be 98 or 92, or that it matters whether the 5% is one
continuous stretch. But it's a reasonable starting point for making a
decision about whether link utilization is too high.

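A quick way to check how much time a given percentile lets you ignore for
a given window, as a small Python sketch (the function name is made up for
this example):

```python
def ignored_minutes(window_hours, pct=95):
    """Minutes in the window that fall above the pct-th percentile."""
    return window_hours * 60 * (100 - pct) / 100


# 5% of a 10-hour window, a 24-hour day and a 30-day month.
for hours in (10, 24, 30 * 24):
    print(hours, "hours ->", ignored_minutes(hours), "minutes ignored")
```

Changing pct to 98 or 92 shows how sensitive the ignored time is to the
choice of threshold.
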
David Russell
________________________________

From: owner-nanog@merit.edu on behalf of Jo Rhett
Sent: Wed 2/22/2006 1:12 PM
To: nanog@merit.edu
Subject: How do you (not how do I) calculate 95th percentile?




I am wondering what other people are doing for 95th percentile
calculations these days.  Not how you gather the data, but how often you
check the counter? Do you use averages or maximums over time periods to
create the buckets used for the 95th percentile calculation?

A lot of smaller folks check the counter every 5 min and use that same
value for the 95th percentile.  Most of us larger folks need to check
more often to prevent 32-bit counters from rolling over too often.  Are
you larger folks averaging the retrieved values over a larger period?
Using the maximum within a larger period?  Or just using your saved
values?

This is curiosity only.  A few years ago we compared the same data and
the answers varied wildly.  It would appear from my latest check that it
is becoming more standardized on 5-minute averages, so I'm asking here on
NANOG as a reality check.

Note: I have AboveNet, Savvis, Verio, etc calculations.  I'm wondering
if there are any other odd combinations out there.

Reply to me offlist.  If there is interest I'll summarize the results
without identifying the source.

--
Jo Rhett
senior geek
SVcolo : Silicon Valley Colocation





Note: The information contained in this message may be privileged and
confidential and protected from disclosure. If the reader of this message
is not the intended recipient, or an employee or agent responsible for
delivering this message to the intended recipient, you are hereby
notified that any dissemination, distribution or copying of this
communication is strictly prohibited. If you have received this
communication in error, please notify us immediately by replying to the
message and deleting it from your computer. Thank you. ThruPoint, Inc.

