[30186] in North American Network Operators' Group
Re: exchange point media
daemon@ATHENA.MIT.EDU (Richard A. Steenbergen)
Mon Jul 17 17:26:01 2000
Date: Mon, 17 Jul 2000 17:16:31 -0400 (EDT)
From: "Richard A. Steenbergen" <ras@e-gerbil.net>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: nanog@nanog.org
Message-ID: <Pine.BSF.4.21.0007171707340.95155-100000@overlord.e-gerbil.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Errors-To: owner-nanog-outgoing@merit.edu
On Mon, 17 Jul 2000, Mikael Abrahamsson wrote:
> We had a discussion here a while back about exchange point media. The
> outcome was that Gigabit ethernet vendors do support jumbo frames and
> that the MTU disadvantage GE has could be overcome.
>
> Now, imagine the following scenario:
>
> We connect a router (router1) to this fictitious exchange point running
> (gig)ethernet. This router does support jumbo frames and has an 8k MTU.
>
> Somewhere else on the exchange point is another router (router2), also
> connected to the same broadcast domain. This router does NOT support jumbo
> frames but has the standard 1500 MTU.
>
> What happens if router1 tries to send a packet to router2 which is
> 1500 MTU? It thinks it's perfectly valid to send an 8k packet. (PMTUd
> won't work here, we're talking layer2).
Correct. The oversized frame is silently discarded at L2 as a giant frame...
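
A minimal sketch of that failure mode, in Python purely for illustration.
The function name and the 8192-byte figure for "8k" are my assumptions, and
the model ignores Ethernet framing overhead:

```python
def deliver(ip_packet_len: int, sender_mtu: int, receiver_mtu: int) -> str:
    """Model one hop across a shared L2 segment with mismatched MTUs."""
    if ip_packet_len > sender_mtu:
        # The sender knows its own MTU, so it can fragment or signal PMTUd.
        return "handled by sender"
    if ip_packet_len > receiver_mtu:
        # The receiver sees a giant frame and drops it. Nothing at L2
        # reports this back, so the sender never finds out.
        return "dropped silently (giant)"
    return "delivered"


print(deliver(8000, sender_mtu=8192, receiver_mtu=1500))
print(deliver(1500, sender_mtu=8192, receiver_mtu=1500))
```

The point is only that the drop decision is made where no ICMP is generated.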
> My other guess is that if the switch in between (we're probably not
> talking point-to-point-links here because this is an exchange point,
> right?) is layer3-aware (as most are today) it could/would fragment the
> packet or give a needtofrag-ICMP to the originator IP. Will any switch
> today do this? What vendors do this? (I have been told that the old DEC
> Gigaswitches will do this between FDDI and FastEth, it will fragment the
> IP packet if necessary).
I believe a Foundry BigIron doing L3 should, exactly as a router would. At
that point, however, there is no real technical distinction between it and
a router with lots of Ethernet ports. I'm not aware of any exchanges doing
L3...
> A third solution would be that I think I saw somewhere that some OSes
> support setting host routes where you could enter the MTU of certain
> specific IPs. This could also rectify the problem by simply
> configuring the switches for jumbo frames and then setting the default
> MTU to 1500 on routers and then people who support jumbo frames could
> include this in their peering announcements/agreements and if two
> parties do support these both then their equipment could use the
> larger frames when talking to each other over this shared medium.
FreeBSD lets you set the MTU per route, so you could do something like
this and enable a larger MTU for specific targets, I suppose. I'm not aware
of anyone who is doing this (or probably anyone who would, especially at
L2, without a good reason). It also assumes the exchange point's switch can
pass jumbo frames.
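
Roughly what the per-peer scheme amounts to, sketched in Python rather than
in any router's actual configuration. The peer address (a documentation
address), the table name, and the 8192-byte jumbo MTU are all hypothetical,
and the fragment count assumes IPv4 with DF clear and no IP options:

```python
DEFAULT_MTU = 1500  # safe value for any peer that has not agreed to jumbo
IPV4_HEADER = 20    # bytes, assuming no IP options

# Hypothetical peers whose peering agreement says they accept jumbo frames.
PEER_MTU = {"192.0.2.10": 8192}


def mtu_for(next_hop: str) -> int:
    """Look up the MTU to use toward a next hop, like a per-route MTU."""
    return PEER_MTU.get(next_hop, DEFAULT_MTU)


def fragments_needed(ip_packet_len: int, next_hop: str) -> int:
    """Number of IPv4 fragments needed to reach next_hop."""
    mtu = mtu_for(next_hop)
    if ip_packet_len <= mtu:
        return 1
    payload = ip_packet_len - IPV4_HEADER
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-payload // per_fragment)  # ceiling division


print(fragments_needed(8000, "192.0.2.10"))  # jumbo-capable peer
print(fragments_needed(8000, "192.0.2.20"))  # peer left at the default
```

The table has to be maintained by hand from what peers tell you, which is
the weak point of the whole idea.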
> Another option would be to pick the other unit's MTU off of the TCP
> session enabled for the (very probable) BGP peering. I seem to
> remember that TCP involves a MTU negotiation between endpoints and
> that would mean that you implicitly get to know the MTU of all your
> peers (which are the ones you might send packets to). Any vendors
> which do a "hack" like this? This would not work if the default MTU is
> 1500 though, it would rather mean you have to have a default MTU of 8k
> (or so) and find out anyone who is not jumbo capable via the TCP
> session involved with the BGP peering.
The TCP MSS is negotiated based on the MTU, so you cannot base the MTU on
the MSS; that is circular logic. I highly doubt you will ever get
auto-negotiated support for jumbo frames without first standardizing jumbo
frames.
I for one would love to see an intelligent standard realizing that 1500 is
a remarkably stupid and limiting number, and enabling us to bring new life
to public exchange point peering.
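
For reference, the MSS arithmetic mentioned above, sketched in Python. This
assumes IPv4 with 20-byte IP and TCP headers and no options; the function
names and the 8192-byte jumbo MTU are mine:

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options


def advertised_mss(interface_mtu: int) -> int:
    """MSS a host advertises in its SYN, derived from its own MTU."""
    return interface_mtu - IPV4_HEADER - TCP_HEADER


def segment_size(local_mtu: int, peer_mtu: int) -> int:
    """Segment size the session ends up using: the smaller advertised MSS."""
    return min(advertised_mss(local_mtu), advertised_mss(peer_mtu))


print(advertised_mss(1500))      # standard Ethernet
print(advertised_mss(8192))      # assumed jumbo MTU
print(segment_size(8192, 1500))  # jumbo router peering with a 1500 router
```

Either way, the MSS only exists because somebody already configured an MTU.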
--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/humble
PGP Key ID: 0x138EA177 (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)