[45459] in North American Network Operators' Group
Re: representativeness of flow data based on samples
daemon@ATHENA.MIT.EDU (Peter Phaal)
Fri Feb 1 17:29:48 2002
Reply-To: <Peter_Phaal@inmon.com>
From: "Peter Phaal" <Peter_Phaal@inmon.com>
To: <nanog@merit.edu>
Date: Fri, 1 Feb 2002 14:29:01 -0800
Message-ID: <001201c1ab6f$da642400$3200000a@xo.com>
On Wed Jan 30 14:04:40 2002, Joe Abley wrote:
>There are a few vendors who now provide traffic export from high-speed
>interfaces by sampling those interfaces at a particular rate, and
>using the sampled packets to populate the per-flow counters, rather
>than looking at every packet.
>Does anybody here know of recent research with real internet traffic
>which compares different sample rates wrt the representativeness of
>the resulting flow data?
On Wed Jan 30 23:50:11 2002, Fred True replied:
| You might find this related talk useful:
| http://www.research.att.com/~duffield/pubs/usage-imw2001.pdf
While the Duffield talk mentions packet sampling, it is primarily concerned
with sampling flow records to reduce post-processing overhead (i.e., it
addresses the accuracy of sampling exported NetFlow records, rather than
the accuracy of NetFlow records generated using packet sampling).
Here are a few references that address the issue of packet sampling
accuracy:
http://www.inmon.com/PDF/sFlowBilling.pdf
http://www.hpl.hp.com/techreports/92/HPL-92-35.html
http://www.caida.org/outreach/papers/1993/asmw/
I don't know of any other published studies. However, I have been involved
in a number of unpublished tests in which sampling was shown to produce
valid results with sufficient accuracy, provided that suitable sampling
rates and aggregation periods were selected.
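
To illustrate why the sampling rate and the aggregation period both
matter, here is a rough sketch based on the standard binomial
approximation for random 1-in-N packet sampling. It is not taken from the
tests mentioned above. It assumes packets are sampled independently and
that the sampled fraction is small, in which case the relative error of an
estimate for a traffic class depends mainly on how many samples fell into
that class. The function names are hypothetical, chosen for this example.

```python
import math

# Assumption: independent random 1-in-N packet sampling with a small
# sampling fraction, so the sample count for a class is roughly
# binomial/Poisson and its relative standard deviation is ~ sqrt(1/c).

Z_95 = 1.96  # two-sided 95% confidence multiplier for a normal approximation


def estimate_total(samples_in_class: int, sampling_rate: int) -> int:
    """Scale the sample count for a class up to an estimated packet count."""
    return samples_in_class * sampling_rate


def relative_error_95(samples_in_class: int) -> float:
    """Approximate 95% relative error bound for a class with c samples."""
    if samples_in_class <= 0:
        raise ValueError("need at least one sample in the class")
    return Z_95 * math.sqrt(1.0 / samples_in_class)


def samples_needed(target_relative_error: float) -> int:
    """Samples required in a class to reach the target 95% relative error."""
    if target_relative_error <= 0:
        raise ValueError("target error must be positive")
    return math.ceil((Z_95 / target_relative_error) ** 2)


if __name__ == "__main__":
    c = 400        # samples seen for one traffic class in the period
    rate = 1000    # 1-in-1000 sampling
    print(estimate_total(c, rate), relative_error_95(c))
    print(samples_needed(0.05))
```

Under these assumptions the error bound depends on the number of samples
collected for the class, not directly on the sampling rate. A coarser
sampling rate can therefore be offset by a longer aggregation period, as
long as enough samples accumulate for each class of interest.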
Peter