[191607] in North American Network Operators' Group


Re: CDN Overload?

daemon@ATHENA.MIT.EDU (Mike Hammett)
Fri Sep 23 03:25:28 2016

X-Original-To: nanog@nanog.org
Date: Thu, 22 Sep 2016 18:08:53 -0500 (CDT)
From: Mike Hammett <nanog@ics-il.net>
Cc: NANOG <nanog@nanog.org>
In-Reply-To: <35119344-8731-4C6F-A561-E2E7B43D0F70@ndsu.edu>
Errors-To: nanog-bounces@nanog.org

Do we have any contacts at Microsoft that we can talk to about this? This
time around, they are the common denominator. I know people have been
complaining about this for longer than Windows 10 has been out, so there
must be some other reasons why other parties are to blame.

-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

----- Original Message -----
From: Bruce Curtis <bruce.curtis@ndsu.edu>
To: Mike Hammett <nanog@ics-il.net>
Cc: Martin Hannigan <hannigan@gmail.com>, NANOG <nanog@nanog.org>
Sent: Thu, 22 Sep 2016 16:28:17 -0500 (CDT)
Subject: Re: CDN Overload?


  I have seen traffic from Microsoft in Europe to single hosts on our
campus that seemed to be unusually high (in bps) and long-lived.

  I don’t recall if the few hosts I noticed this on over time were only
on our campus wifi.

  If not, perhaps the common factor is longer latency? Both connections
over wireless and connections from Europe to the US would have longer
latency.

  Perhaps this longer latency combined with some other factor is
triggering a bug in modern TCP Congestion Control algorithms?



The presentation linked below mentions that there have been bugs in TCP
Congestion Control algorithm implementations. Perhaps there could be
other bugs that result in the described issue?

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/ms_feb07_eval.ppt.pdf


I have seen cases on our campus where too-small buffers on an ethernet
switch caused a Linux TCP Congestion Control algorithm to act badly,
resulting in slower downloads than a simple algorithm that depended on
dropped packets rather than trying to determine window sizes etc. The fix
in that case was to increase the buffer size. Of course bufferbloat is
also known to play havoc with TCP Congestion Control algorithms. Just
wondering if some combination of higher latency and another unknown
variable, or just a bug, might cause a TCP Congestion Control algorithm
to think it can safely try to increase the transmit rate?
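
For anyone who wants to play with the buffer-size effect, here is a
minimal sketch of a loss-based sender using textbook
additive-increase/multiplicative-decrease (AIMD). It is a toy model under
stated assumptions, not the Linux algorithm or the switch from the case
above: the function name, the one-update-per-round-trip loop, and the rule
"a drop happens once the window exceeds BDP plus buffer" are all
illustrative assumptions.

```python
def simulate_aimd(bdp_packets, buffer_packets, rounds):
    """Toy loss-based AIMD sender; returns link utilization in (0, 1].

    Assumptions (hypothetical, for illustration only):
      - one window update per round trip
      - the path holds bdp_packets in flight plus buffer_packets queued
      - exceeding that total causes a drop, which halves the window
    """
    cwnd = 1.0
    delivered = 0.0
    for _ in range(rounds):
        if cwnd > bdp_packets + buffer_packets:
            cwnd = max(1.0, cwnd / 2)  # multiplicative decrease on a drop
        else:
            cwnd += 1.0                # additive increase per round trip
        # At most one bandwidth-delay product gets through per round trip.
        delivered += min(cwnd, bdp_packets)
    return delivered / (rounds * bdp_packets)


if __name__ == "__main__":
    for buf in (0, 25, 100):
        print(buf, round(simulate_aimd(100, buf, 2000), 3))
```

In this toy model a buffer smaller than the bandwidth-delay product lets
the window fall below the pipe size after each drop, so the link idles
part of the time; a larger buffer hides that. It says nothing about
whether a sender would overshoot a rate limiter for days, which is the
open question in this thread.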


> On Sep 21, 2016, at 8:29 PM, Mike Hammett <nanog@ics-il.net> wrote:
>
> Thanks Marty. I have only experienced this on my network once and it was
> directly with Microsoft, so I haven't done much until a couple days ago
> when I started this campaign. I don't know if anyone else has brought
> this to anyone's attention. I just sent an e-mail to Owen when I saw
> yours.
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
> ----- Original Message -----
>
> From: "Martin Hannigan" <hannigan@gmail.com>
> To: "Mike Hammett" <nanog@ics-il.net>
> Cc: "NANOG" <nanog@nanog.org>
> Sent: Wednesday, September 21, 2016 8:19:35 PM
> Subject: Re: CDN Overload?
>
> Mike,
>
> I will forward to the requisite group for a look. Have you brought this
> to our attention previously? I don't see anything. If you did, please
> forward me the ticket numbers or message(s) (peering@ is best) so we can
> track it down and see if someone already has it in queue.
>
> Jared alluded to fasttcp a few emails ago. Astute man.
>
> Best,
>
> Martin Hannigan
> AS 20940 // AS 32787
>
> On Sep 21, 2016, at 14:30, Mike Hammett <nanog@ics-il.net> wrote:
>
> https://docs.google.com/spreadsheets/d/1Jdm0dOBf81kSnXEvVfI6ZJbWFNt5AbYUV8CDxGwLSm8/edit?usp=sharing
>
> I have made the anonymized answers public. This will obviously have some
> bias to it given that I mostly know fixed wireless operators, but I'm
> hoping this gets some good distribution to catch more platforms.
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
> ----- Original Message -----
>
> From: "Mike Hammett" <nanog@ics-il.net>
> To: "NANOG" <nanog@nanog.org>
> Sent: Wednesday, September 21, 2016 9:08:55 AM
> Subject: Re: CDN Overload?
>
> https://goo.gl/forms/LvgFRsMdNdI8E9HF3
>
> I have made this into a Google Form to make it easier to track compared
> to randomly formatted responses on multiple mailing lists, Facebook
> Groups, etc.
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
> ----- Original Message -----
>
> From: "Mike Hammett" <nanog@ics-il.net>
> To: "NANOG" <nanog@nanog.org>
> Sent: Monday, September 19, 2016 12:34:48 PM
> Subject: CDN Overload?
>
> I participate on a few other mailing lists focused on eyeball networks.
> For a couple years I've been hearing complaints that this CDN or that CDN
> was behaving badly. It's been severely ramping up the past few months.
> There have been some wild allegations, but I would like to develop a bit
> more standardized evidence collection. Initially LimeLight was the only
> culprit, but recently it has been Microsoft as well. I'm not sure if
> there have been any others.
>
> The principal complaint is that upstream of whatever is doing the rate
> limiting for a given customer there is significantly more capacity being
> utilized than the customer has purchased. This could happen briefly as
> TCP adjusts to the capacity limitation, but in some situations this has
> persisted for days at a time. I'll list out a few situations as best as I
> can recall them. Some of these may even be merges of a couple situations.
> The point is to show the general issue and develop a better process for
> collecting what exactly is happening at the time and how to address it.
>
> One situation had approximately 45 megabit/s of capacity being used up by
> a customer that had a 1.5 megabit/s plan. All other traffic normally held
> itself within the 1.5 megabit/s, but this particular CDN sent far more
> for extended periods of time.
>
> A frequent occurrence has someone with a single digit megabit/s
> limitation consuming 2x - 3x more than their plan on the other side of
> the rate limiter.
>
> Last month on my own network I saw someone with 2x - 3x being consumed
> upstream and they had *190* connections downloading said data from
> Microsoft.
>
> The past week or two I've been hearing of people only having a single
> connection downloading at more than their plan rate.
>
> These situations effectively shut out all other Internet traffic to that
> customer or even portion of the network for low capacity NLOS areas. It's
> a DoS caused by downloads. What happened to the days of MS BITS and you
> didn't even notice the download happening? A lot of these guys think that
> the CDNs are just a pile of dicks looking to ruin everyone's day and I'm
> certain that there are at least a couple people at each CDN that aren't
> that way. ;-)
>
> Lots of rambling, sure. What do I need to have these guys collect as
> evidence of a problem and who should they send it to?
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
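
On the evidence question in the quoted message, one number that is easy
to standardize is how far the upstream rate exceeds the plan rate. The
sketch below assumes two byte-counter samples taken upstream of the rate
limiter; the function name and parameters are hypothetical, and how the
counters are read (SNMP, flow data, etc.) is left out on purpose.

```python
def overage_ratio(bytes_start, bytes_end, seconds, plan_mbps):
    """Ratio of observed upstream rate to the customer's plan rate.

    bytes_start / bytes_end: byte counters sampled upstream of the rate
    limiter (assumed not to have wrapped between samples).
    seconds: time between the two samples.
    plan_mbps: purchased rate in megabit/s.
    """
    observed_mbps = (bytes_end - bytes_start) * 8 / seconds / 1_000_000
    return observed_mbps / plan_mbps


if __name__ == "__main__":
    # The 45 megabit/s on a 1.5 megabit/s plan case from the quoted
    # message: 337,500,000 bytes in 60 seconds is 45 megabit/s.
    print(overage_ratio(0, 337_500_000, 60, 1.5))
```

A ratio that stays well above 1 across many consecutive samples is what
separates the sustained cases from TCP briefly adjusting to the limit.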

---
Bruce Curtis                         bruce.curtis@ndsu.edu
Certified NetAnalyst II                701-231-8527
North Dakota State University




