[98680] in North American Network Operators' Group


Re: Extreme congestion (was Re: inter-domain link recovery)

daemon@ATHENA.MIT.EDU (Sean Donelan)
Wed Aug 15 12:08:41 2007

Date: Wed, 15 Aug 2007 11:59:54 -0400 (EDT)
From: Sean Donelan <sean@donelan.com>
To: Fred Baker <fred@cisco.com>
cc: Stephen Wilcox <steve.wilcox@packetrade.com>, Chengchen Hu <huc@ieee.org>,
        nanog <nanog@merit.edu>
In-Reply-To: <C761DCEF-57AD-41C9-807F-03300F5FCE55@cisco.com>
Errors-To: owner-nanog@merit.edu


On Wed, 15 Aug 2007, Fred Baker wrote:
> On Aug 15, 2007, at 8:35 AM, Sean Donelan wrote:
>> Or should IP backbones have methods to predictably control which IP 
>> applications receive the remaining IP bandwidth?  Similar to the telephone 
>> network special information tone -- All Circuits are Busy.  Maybe we've 
>> found a new use for ICMP Source Quench.
>
> Source Quench wouldn't be my favored solution here. What I might suggest is 
> taking TCP SYN and SCTP INIT (or new sessions if they are encrypted or UDP) 
> and put them into a lower priority/rate queue. Delaying the start of new work 
> would have a pretty strong effect on the congestive collapse of the existing 
> work, I should think.

I was joking about Source Quench (missing :-), it's got a lot of problems.

But I think the fundamental issue is who is responsible for controlling 
the back-off process?  The edge or the middle?

Using different queues implies the middle (i.e. routers).  At best it
might be the "near-edge," which would mean creating some type of shared
knowledge between past, current and new sessions in the host stacks
(and maybe middle-boxes like NAT gateways).
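
To make the "different queues" idea concrete, here is a minimal sketch of
what a router in the middle would be doing. It is only an illustration under
stated assumptions, not anything proposed in this thread in detail: the
Packet fields, the TwoQueueScheduler name, and the strict-priority dequeue
are all hypothetical, and a "new session" is assumed to mean a TCP SYN
without ACK or an SCTP INIT chunk. Encrypted or UDP sessions, which Fred also
mentions, would need flow state and are not handled here.

```python
from collections import deque
from dataclasses import dataclass

TCP, SCTP = 6, 132             # IANA IP protocol numbers
TCP_SYN, TCP_ACK = 0x02, 0x10  # TCP header flag bits
SCTP_INIT = 1                  # SCTP chunk type for INIT


@dataclass
class Packet:
    # Hypothetical, already-parsed view of a packet header.
    proto: int
    tcp_flags: int = 0
    sctp_chunk_type: int = -1


def starts_new_session(pkt: Packet) -> bool:
    """Assumption: only an initial TCP SYN or an SCTP INIT counts as new work."""
    if pkt.proto == TCP:
        return bool(pkt.tcp_flags & TCP_SYN) and not (pkt.tcp_flags & TCP_ACK)
    if pkt.proto == SCTP:
        return pkt.sctp_chunk_type == SCTP_INIT
    return False


class TwoQueueScheduler:
    """Strict priority: existing work is always served before new sessions."""

    def __init__(self) -> None:
        self.established = deque()
        self.new_sessions = deque()

    def enqueue(self, pkt: Packet) -> None:
        if starts_new_session(pkt):
            self.new_sessions.append(pkt)
        else:
            self.established.append(pkt)

    def dequeue(self):
        # New sessions are only served when no existing work is waiting.
        if self.established:
            return self.established.popleft()
        if self.new_sessions:
            return self.new_sessions.popleft()
        return None
```

Note that all of the state lives in the router: the hosts at the edge learn
about the congestion only indirectly, through delayed or dropped connection
attempts.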

How fast do you need to signal large-scale back-off, and over what time
period?  Since major events in the real world also result in a lot of
"new" traffic, how do you signal new sessions before they reach the
affected region of the network?  Can you use BGP to signal the far
reaches of the Internet that I'm having problems, and that other ASNs
should start slowing things down before their traffic reaches my region?
(A security can-o-worms being opened.)

