[112758] in North American Network Operators' Group
Re: Shady areas of TCP window autotuning?
daemon@ATHENA.MIT.EDU (Leo Bicknell)
Tue Mar 17 11:39:33 2009
Date: Tue, 17 Mar 2009 10:39:13 -0500
From: Leo Bicknell <bicknell@ufp.org>
To: Mikael Abrahamsson <swmike@swm.pp.se>, Marian Ďurković <md@bts.sk>
In-Reply-To: <20090317084739.GB68010@bts.sk>
	<alpine.DEB.1.10.0903170839530.25843@uplift.swm.pp.se>
Cc: nanog@nanog.org
In a message written on Tue, Mar 17, 2009 at 08:46:50AM +0100, Mikael Abrahamsson wrote:
> In my mind, the problem is that they tend to use FIFO, not that the queues
> are too large.
We could quickly get lost in queueing science, but at a high level you
are correct that both are a problem.
> What we need is ~100ms of buffer and fair-queue or equivalent, at both
> ends of the end-user link (unless it's 100 meg or more, where 5ms buffers
> and FIFO tail-drop seems to work just fine), because 1 meg uplink (ADSL)
> and 200ms buffer is just bad for the customer experience, and if they
> can't figure out how to do fair-queue properly, they might as well just do
> WRED 30 ms 50 ms (100% drop probability at 50ms) or even taildrop at 50ms.
> It's very rare today that an end user is helped by anything buffering
> their packet more than 50ms.
Some of this technology exists, just not where it can do a lot of
good.  Some fancier CPE devices know how to queue VoIP in a priority
queue and elevate some games.  This works great when the cable
modem or DSL modem is integrated, but when you buy a "router" and
hook it to your provider-supplied DSL or cable modem, it does no
good.  I hate to suggest such a thing, but perhaps a protocol for a
modem to communicate a committed rate to a router would be a good
thing...
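Purely to illustrate the idea (the protocol is hypothetical, and the
function name and the 95% margin below are assumptions of mine), the
only thing the router really needs to learn is one number, which it
could then shape just below so the queue stays on the smart box:

    # Hypothetical sketch only -- no such protocol exists today.  If the modem
    # could advertise its committed upstream rate, the CPE router could shape
    # slightly below it so packets queue where priority queueing actually runs.
    def shape_rate_bps(committed_bps: float, margin: float = 0.95) -> float:
        """Shape a bit under the committed rate so the modem's FIFO never fills."""
        return committed_bps * margin

    print(shape_rate_bps(1_000_000))   # 1 Mbit/s ADSL uplink -> shape to ~950 kbit/s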
I'd also like to point out that where this technology exists today
it's almost never used.  How many 2600's and 3600's have you seen
terminating T1's or DS-3's with nothing changed from the default
FIFO queue?  I am particularly fond of the DS-3 frame circuits with
100 PVC's, each with 40 packets of buffer.  That's 4000 packets of
buffer on a DS-3.  No wonder performance is horrid.
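The arithmetic is worth spelling out (a rough Python back-of-the-envelope;
the full-size 1500-byte packet is my assumption):

    # Why 4000 packets of buffer on a DS-3 is horrid.
    DS3_BPS      = 45_000_000        # ~45 Mbit/s, approximate DS-3 rate
    PKT_BYTES    = 1500              # assumed full-size packets
    PVCS         = 100
    PKTS_PER_PVC = 40

    total_pkts = PVCS * PKTS_PER_PVC                    # 4000 packets
    drain_secs = total_pkts * PKT_BYTES * 8 / DS3_BPS   # time to empty the queue
    print(f"{total_pkts} packets ~= {drain_secs:.2f} s of queueing delay")
    # -> roughly 1.07 seconds of potential delay on a single DS-3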
In a message written on Tue, Mar 17, 2009 at 09:47:39AM +0100, Marian Ďurković wrote:
> Reducing buffers to 50 msec clearly avoids excessive queueing delays,
> but let's look at this from the wider perspective:
>
> 1) initially we had a system where hosts were using fixed 64 kB buffers
> This was unable to achieve good performance over high BDP paths
Note that the host buffer, which generally should be 2 * bandwidth
* delay, is, well, basically unrelated to the hop-by-hop network
buffers.
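As a quick sketch of that rule of thumb (the 100 Mbit/s path and 80 ms
RTT are just example numbers of mine):

    # Host (socket) buffer sized as 2 * bandwidth * delay, in bytes.
    def host_buffer_bytes(bandwidth_bps: float, rtt_secs: float) -> int:
        return int(2 * bandwidth_bps * rtt_secs / 8)

    # A 100 Mbit/s path with an 80 ms RTT wants roughly a 2 MB socket buffer.
    print(host_buffer_bytes(100_000_000, 0.080))   # -> 2000000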
> 2) OS maintainers have fixed this by means of buffer autotuning, where
> the host buffer size is no longer the problem.
>
> 3) the above fix introduces unacceptable delays into networks and users
> are complaining, especially if autotuning approach #2 is used
>
> 4) network operators will fix the problem by reducing buffers to e.g. 50 msec
>
> So at the end of the day, we'll again have a system which is unable to
> achieve good performance over high BDP paths, since with reduced buffers
> we'll have an underbuffered bottleneck in the path which will prevent full
> link utilization if RTT > 50 msec. Thus all the above exercises will end up
> in having almost the same situation as before (of course YMMV).
This is an incorrect conclusion.  The host buffer has to wait an
RTT for an ack to return, so it has to buffer a full RTT of data
and then some.  Hop-by-hop buffers only have to buffer until an
output port on the same device is free.  This is why a router with
20 10GE interfaces can have a 75-packet-deep queue on each interface
and work fine; the packet only sits there until a 10GE output
interface is available (a few microseconds).
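Same back-of-the-envelope as before, again assuming 1500-byte packets:

    # Worst-case time a packet sits in a 75-packet output queue on 10GE.
    TEN_GE_BPS = 10_000_000_000
    QUEUE_PKTS = 75
    PKT_BYTES  = 1500

    worst_case_secs = QUEUE_PKTS * PKT_BYTES * 8 / TEN_GE_BPS
    print(f"{worst_case_secs * 1e6:.0f} microseconds")   # -> about 90 us
    # versus the full RTT and then some that the *host* has to buffer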
The problems are related: as TCP goes faster there is an increased
probability it will fill the buffer at any particular hop, but that
means a link is full and TCP is hitting the maximum speed for that
path anyway.  Reducing the buffer size (to a point) /does not slow/
TCP; it reduces the feedback loop time.  It also gives the user
less jitter, which is good for VoIP and ssh and the like.
However, if the hop-by-hop buffers are filling and there is lag and
jitter, that's a sign the hop-by-hop buffers were always too large.
99.99% of devices ship with buffers that are too large.
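To put numbers on the feedback-loop point, using Mikael's 1 Mbit/s ADSL
uplink from above (1500-byte packets assumed):

    # How many full-size packets fit in a given amount of queueing delay
    # on a 1 Mbit/s uplink.
    UPLINK_BPS = 1_000_000
    PKT_BYTES  = 1500

    def pkts_for_delay(ms: float) -> int:
        return int(UPLINK_BPS * (ms / 1000) / (PKT_BYTES * 8))

    print(pkts_for_delay(200))   # ~16 packets: every VoIP/ssh packet can wait 200 ms
    print(pkts_for_delay(50))    # ~4 packets: the link still fills, feedback is 4x faster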
-- 
       Leo Bicknell - bicknell@ufp.org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/