[98715] in North American Network Operators' Group
Re: Extreme congestion (was Re: inter-domain link recovery)
daemon@ATHENA.MIT.EDU (Sean Donelan)
Thu Aug 16 03:21:33 2007
Date: Thu, 16 Aug 2007 02:48:08 -0400 (EDT)
From: Sean Donelan <sean@donelan.com>
To: Fred Baker <fred@cisco.com>
cc: nanog <nanog@merit.edu>
In-Reply-To: <2FD4EA44-A9FB-4AA1-AD0E-280B18BD5305@cisco.com>
Errors-To: owner-nanog@merit.edu
On Wed, 15 Aug 2007, Fred Baker wrote:
> On Aug 15, 2007, at 8:39 PM, Sean Donelan wrote:
>> On Wed, 15 Aug 2007, Fred Baker wrote:
>>> So I would suggest that a third thing that can be done, after the other
>>> two avenues have been exhausted, is to decide to not start new sessions
>>> unless there is some reasonable chance that they will be able to
>>> accomplish their work.
>>
>> I view this as part of the flash crowd family of congestion problems, a
>> combination of a rapid increase in demand and a rapid decrease in capacity.
>
> In many cases, yes. I know of a certain network that ran with 30% loss for a
> matter of years because the option didn't exist to increase the bandwidth.
> When it became reality, guess what they did.
> That's when I got to thinking about this.
Yeah, necessity is always the mother of invention. I first tried rate
limiting the TCP SYNs during the Starr/Clinton report. It worked great
for a while, but then the SYN flood started backing up not only on the
"congested" link; it also started causing congestion in the other peering
networks (those were the days of OC3 backbones and head-of-line blocking
NAP switches). And then the server choked....
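
For illustration, SYN rate limiting of the kind described above can be
sketched as a token bucket. This is only a minimal sketch, not the
configuration that was actually used: the names TokenBucket and
admit_syns, and the rate and burst parameters, are assumptions made up
for the example. It also does not model the retransmitted SYNs that
caused the backlog.

```python
class TokenBucket:
    """Admit at most `rate` events per second, with bursts up to `burst`.

    Hypothetical helper for illustration; not taken from any router or OS.
    """

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # time of the previous decision

    def allow(self, now):
        # Refill for the elapsed time, never exceeding the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        # Spend one token per admitted SYN; drop the SYN if none is left.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admit_syns(arrival_times, rate, burst):
    """Return one True/False decision per SYN arrival time (in seconds)."""
    bucket = TokenBucket(rate, burst)
    return [bucket.allow(t) for t in arrival_times]
```

Every False here is a dropped SYN that the sender will normally send
again, which is how the load ends up queued somewhere else instead of
going away.
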
So that's why I keep returning to the need to push traffic back a couple
of ASNs. If it's going to get dropped anyway, drop it sooner.

It's also why I would really like to try to do something about the
woodpecker hosts that think congestion means try more. If backing off
slows down the host's retries, that's even further pushback.
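
To make the woodpecker point concrete, here is a rough comparison of a
host that retries at a fixed interval with one that doubles its wait
after every failure. It is a sketch under assumed numbers: the function
names and the base, factor, cap, and interval values are hypothetical
and not taken from any particular stack or application.

```python
def backoff_times(attempts, base=1.0, factor=2.0, cap=60.0):
    """Times (seconds after the first failure) of retries with exponential backoff."""
    times = []
    t = 0.0
    delay = base
    for _ in range(attempts):
        t += delay
        times.append(t)
        # Wait longer after each failure, up to the cap.
        delay = min(cap, delay * factor)
    return times


def woodpecker_times(attempts, interval=1.0):
    """Times of retries from a host that keeps trying at a fixed interval."""
    return [interval * (i + 1) for i in range(attempts)]


def retries_within(times, window):
    """Count how many retries land inside the first `window` seconds."""
    return sum(1 for t in times if t <= window)
```

With these assumed defaults, the backing-off host sends 3 retries in the
first 10 seconds and the woodpecker sends 10, so the backoff itself
removes load from the congested path.
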