[92] in 6.033 discussion


usage statistics

Saltzer@ATHENA.MIT.EDU (Saltzer@ATHENA.MIT.EDU)
Thu Mar 21 08:34:27 1996

Date: Thu, 21 Mar 96 00:03:13 -0500
From: who really knows? <jbsugg@MIT.EDU>
Subject: usage statistics

how were the percentages in matt braun's usage statistics calculated? if you
take the number of TCP port 80 packets, and divide it by the total number of
TCP packets, you get a percentage considerably higher than the one given in the
stats (on the order of 35%). am i just clueless about percentages, or is there
a mistake here?

From: Matt Braun <matt@MIT.EDU>
Subject: Re: usage statistics 
Date: Thu, 21 Mar 1996 03:40:30 EST


Hi,

Good call...there was a problem with the 'percentage' column in the TCP Top 10
list. 

Because of the way I ran the statistics, each TCP packet is counted twice for
the purposes of the port breakdown: once for its source port and once for its
destination port. This was necessary so that the numbers for http counted both
the requests and the responses. It does not affect the packet and byte counts
listed for each port. What it does mean is that the program calculated each
percentage against twice the actual number of packets, so every percentage came
out at half of what it should be. It also means that a packet both from port 80
and going to port 80 *would* be counted twice for that port and cause error,
but that is not a likely case and should be covered by the large error bars of
'approximation.'
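
To illustrate the double counting described above, here is a minimal Python
sketch. It is not the gathering software used for these statistics; the
function name port_shares and the sample packets are made up for illustration,
and packets are assumed to be simple (source port, destination port) pairs.

```python
from collections import Counter

def port_shares(packets):
    """Tally packets per port and return (halved, corrected) percentages.

    packets: iterable of (src_port, dst_port) pairs, one pair per TCP packet.
    This is a hypothetical helper, not the original statistics program.
    """
    per_port = Counter()
    total_packets = 0
    for src, dst in packets:
        total_packets += 1
        per_port[src] += 1  # tallied once for the source port
        per_port[dst] += 1  # and once more for the destination port

    # Every packet contributes two tallies, so this is 2 * total_packets.
    total_tallies = sum(per_port.values())

    # Dividing by the tally total reproduces the halved percentages.
    halved = {p: 100.0 * n / total_tallies for p, n in per_port.items()}
    # Dividing by the real packet count gives the intended percentages.
    corrected = {p: 100.0 * n / total_packets for p, n in per_port.items()}
    return halved, corrected

# Made-up sample: one http request/response pair, one telnet pair.
packets = [(1025, 80), (80, 1025), (1026, 23), (23, 1026)]
halved, corrected = port_shares(packets)
print(halved[80], corrected[80])
```

With this sample, port 80 is involved in 2 of the 4 packets, so the corrected
share is 50% while the tally-based figure is 25%. A packet sent from port 80 to
port 80 would be tallied twice for that port, which is the rare error case
mentioned above.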

I'm sorry it took so long to see this error.  This is the first time we have
run statistics like this and as such I did not have the time to thoroughly
understand the biases of the gathering software.  

I fixed the data on my page, and Jonathan updated the link on the 6.033 page
to mention the error.  

Anyway, back to my paper,


        Matt 'Candidate for a  6.033 study of his own' Braun
