[185668] in North American Network Operators' Group


Re: Long-haul 100Mbps EPL circuit throughput issue

daemon@ATHENA.MIT.EDU (alvin nanog)
Thu Nov 5 18:19:33 2015

X-Original-To: nanog@nanog.org
Date: Thu, 5 Nov 2015 15:19:12 -0800
From: alvin nanog <nanogml@Mail.DDoS-Mitigator.net>
To: Eric Dugas <edugas@unknowndevice.ca>
In-Reply-To: <CALKrK4na=wKSi=vf4EgPetoXJxaHTxfGt8HXYr6CE+xPk1Vy4g@mail.gmail.com>
Cc: nanog@nanog.org
Errors-To: nanog-bounces@nanog.org


hi eric

On 11/05/15 at 04:48pm, Eric Dugas wrote:
...
> Linux test machine in customer's VRF <-> SRX100 <-> Carrier CPE (Cisco
> 2960G) <-> Carrier's MPLS network <-> NNI - MX80 <-> Our MPLS network <->
> Terminating edge - MX80 <-> Distribution switch - EX3300 <-> Linux test
> machine in customer's VRF
> 
> We can fill the link in UDP traffic with iperf but with TCP, we can reach
> 80-90% and then the traffic drops to 50% and slowly increases up to 90%.
 
if i was involved with these tests, i'd start looking for "not enough tcp send 
and tcp receive buffers"
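
a quick way to see what the kernel will allow is to read the tcp_rmem and
tcp_wmem sysctls (min, default, max, in bytes) on both linux test machines.
a minimal python sketch, assuming linux and the usual /proc/sys/net/ipv4
paths; the helper names are made up for illustration:

```python
from pathlib import Path


def parse_tcp_mem(text):
    """Parse a tcp_rmem/tcp_wmem line into (min, default, max) in bytes."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    low, default, high = (int(f) for f in fields)
    return low, default, high


def read_tcp_mem(name):
    """Read /proc/sys/net/ipv4/<name>; return None if the file is absent."""
    path = Path("/proc/sys/net/ipv4") / name
    if not path.exists():
        return None  # not Linux, or /proc not mounted
    return parse_tcp_mem(path.read_text())


if __name__ == "__main__":
    for name in ("tcp_rmem", "tcp_wmem"):
        print(name, read_tcp_mem(name))
```

if the max value is well below what the path needs, a single tcp flow
can't keep the circuit full no matter what iperf is told to do.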

for flooding at 100Mbit/s, you'd need about 12MB buffers (that's a full second
of data at 12.5MB/s; the bare minimum is bandwidth x round-trip time) ...

udp does NOT care too much about dropped data due to the buffers,
but tcp cares about "not enough buffers" .. somebody resend packet# 1357902456 :-)

at least double or triple the buffers needed to compensate for all kinds of 
network wackiness: 
data in transit, misconfigured hardware-in-the-path, misconfigured iperfs, 
misconfigured kernels, interrupt handling, etc, etc
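
the sizing arithmetic can be sketched like this. the 80 ms round-trip time is
purely an assumption for illustration (the thread doesn't give the real rtt),
and the function names are hypothetical. the baseline is the bandwidth-delay
product, then multiplied by the "double or triple" safety factor:

```python
def bdp_bytes(link_mbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return link_mbps * 1_000_000 * rtt_ms / 1000 / 8


def suggested_buffer_bytes(link_mbps, rtt_ms, safety_factor=3):
    """BDP scaled by a safety factor (2 or 3, per the advice above)."""
    return int(bdp_bytes(link_mbps, rtt_ms) * safety_factor)


def max_tcp_mbps(window_bytes, rtt_ms):
    """Throughput ceiling of one flow that is limited only by its window."""
    return window_bytes * 8 * 1000 / rtt_ms / 1_000_000


if __name__ == "__main__":
    rtt_ms = 80  # ASSUMED round-trip time, not taken from the thread
    print("bdp bytes:       ", bdp_bytes(100, rtt_ms))
    print("suggested buffer:", suggested_buffer_bytes(100, rtt_ms))
    print("1 second of data:", bdp_bytes(100, 1000))
```

with these assumptions the bdp comes out at 1MB and the tripled buffer at 3MB;
the "about 12MB" figure matches a full second of data at 100Mbit/s. plug in the
measured rtt of the circuit (ping between the two test machines) before
trusting any of the numbers.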

- how many "iperf flows" are you also running ??
	- running dozens or 100s of them does affect thruput too

- does the same thing happen with socat ??

- if iperf and socat agree with network thruput, it's the hw somewhere

- slowly increasing thruput doesn't make sense to me ... it sounds like 
something is caching some of the data

magic pixie dust
alvin

> Has anyone dealt with this kind of problem in the past? We've tested by
> forcing ports to 100-FD at both ends, policing the circuit on our side,
> called the carrier and escalated to L2/L3 support. They tried to also
> police the circuit but as far as I know, they didn't modify anything else.
> I've told our support to make them look for underrun errors on their Cisco
> switch and they can see some. They're pretty much in the same boat as us
> and they're not sure where to look at.
> 
