[142593] in North American Network Operators' Group
Why is IPv6 broken?
daemon@ATHENA.MIT.EDU (Bob Network)
Sat Jul 9 17:26:21 2011
From: Bob Network <networkjoe@hotmail.com>
To: <nanog@nanog.org>
Date: Sat, 9 Jul 2011 15:25:27 -0600
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
Why is IPv6 broken?
It's broken, first and foremost, because not all network providers who claim to be tier 1 are tier 1.
Even worse, some of these providers run 6to4 relays or are providers to home users. A user has no choice which provider is running their 6to4 relay...so, they might end up using a relay that is run by a provider who doesn't peer with their intended destination. I don't think the IETF saw that one coming. But the result is to make 6to4 even more broken. Now, I know some people want 6to4 to die, but while it still exists in some form, user experience is worse than it could be. The temporary fix is for any provider to run their own 6to4 relay for their own customers (assuming that they themselves have full connectivity).
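For readers who haven't looked at how 6to4 addressing works, here is a rough Python sketch. It relies on two details that are not in the text above: the 2002::/16 prefix that embeds the user's IPv4 address, and the anycast relay address 192.88.99.1 (RFC 3068), which is why the user has no say in whose relay answers. The helper name sixtofour_prefix and the example address 192.0.2.1 are made up for illustration.

```python
import ipaddress

# Anycast address that 6to4 hosts send relay traffic to (RFC 3068).
# Whichever network announces it "closest" to the user gets the traffic.
RELAY_ANYCAST = ipaddress.IPv4Address("192.88.99.1")

def sixtofour_prefix(ipv4):
    """Return the 6to4 /48 derived from a public IPv4 address (hypothetical helper)."""
    v4 = ipaddress.IPv4Address(ipv4)
    # 2002::/16 followed by the 32 bits of the IPv4 address.
    return ipaddress.IPv6Network(((0x2002 << 112) | (int(v4) << 80), 48))

prefix = sixtofour_prefix("192.0.2.1")   # documentation address, not a real user
embedded = ipaddress.IPv6Address("2002:c000:201::1").sixtofour

print("6to4 prefix:", prefix)
print("IPv4 address recovered from a 6to4 address:", embedded)
print("relay reached via anycast:", RELAY_ANYCAST)
```

Nothing in the address says which provider operates the relay, which is the problem described above.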
Right now, unless you buy transit from multiple tier 1s, and do so with carefully chosen tier 1s, you have only part of the IPv6 internet. Many tier 1s are unsuitable even as backup connections, since you still want your backup connection to have access to the whole internet! Good tier 2 providers might be an excellent choice, since good providers have already done this leg work and can monitor their providers for compliance.
A few myths...
Routing table size has nothing to do with completeness of routes. Google may be one route, through aggregation. And SmallCo may advertise a large route through one provider, and, due to traffic engineering, a smaller route through a second one - in many cases, anyone that had the large route would be able to contact SmallCo, even without the smaller route being present. So routing table size doesn't work as a measure. In addition, some providers aggregate their routing tables to reduce routing load and such. Others intentionally don't, or deaggregate on purpose, so that they can brag about having bigger routing tables. What you need to ask is: "How many /64s can you get to from your network, and how many of these /64s are reachable from at least one other major provider (you don't care about internal-only networks, after all)?" They can give you that information, but many won't want to.
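To make the "count /64s, not routes" point concrete, here is a small Python sketch. The two example tables use the 2001:db8::/32 documentation prefix and are invented; count_64s is a hypothetical helper, not something any provider publishes. It ignores prefixes longer than /64.

```python
import ipaddress

def count_64s(prefixes):
    """Count the distinct /64s covered by a list of IPv6 prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges overlapping and adjacent prefixes, so a
    # deaggregated table does not get credit for the same space twice.
    collapsed = ipaddress.collapse_addresses(nets)
    return sum(2 ** (64 - n.prefixlen) for n in collapsed if n.prefixlen <= 64)

# One aggregated route versus three deaggregated routes (made-up tables).
aggregated = ["2001:db8::/32"]
deaggregated = ["2001:db8::/34", "2001:db8:4000::/34", "2001:db8:4000::/36"]

for name, table in (("aggregated", aggregated), ("deaggregated", deaggregated)):
    print(name, "-", len(table), "routes,", count_64s(table), "/64s covered")
```

In this made-up case the table with three routes covers half as many /64s as the table with one route, so the bigger table is the less complete one.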
It's also not about technical people not getting along. It's about business players trying to make money, but not just that either. It's also about ensuring that providers don't end up assuming more than their share of costs for a link. Just because you have a common peering point doesn't mean that turning peering on would reduce your costs. In some cases it may increase costs tremendously, particularly on your long haul backbone links, because the other party would like to take advantage of an attitude of trust on the internet. That's why we end up with peering policies and contracts.
What is the issue?
Let's take Hurricane. This is no different than other providers...basically, they want to say, "We shouldn't need to pay for IPv6 transit from anyone." This is what Cogent said on IPv4 a few years ago. Google used to say this too for IPv6, not sure if they are still saying it. Basically, "We know we're big enough that you won't want to screw your users by not peering with us."
A small network couldn't pull off this tactic. Imagine a 100 node network that said to the IPv4 tier 1s: "Hey, I'm in the Podunk Internet Exchange, so are you, so I'm going to peer with you so I don't have to buy any bandwidth for my web server (placed in the Podunk exchange)." Sure, they would like to - it would save a ton of money if their site got lots of hits. I mean, who wouldn't want free connectivity?
In IPv6, we're going through what we settled years ago in IPv4 - who has to pay who to connect. After all, even free peering connections have a cost in manpower, debugging, traffic engineering, documentation, etc.
Some players who aren't getting free interconnection to tier 1s in IPv4 want to get it in IPv6. So they've worked to attract lots of users, and done so under the guise of "We like IPv6 and want to promote it." Others have not bothered with trying to attract the users, but have said, "We're too big for you to not want to give us connectivity for free, since it would piss off your users if you don't" (Google did this at one point in the past, may still be doing it). The Google example is basically trying to use a monopoly position to force business decisions.
Now, HE, Google, and others would want you to think, "Hey, IPv6 is all new, and these $#@! other providers just want to make a buck on something they have no right to." Well, perhaps. But what they aren't saying is, "We can turn on BGP for IPv6 on our existing connections to other providers, with no cost to us, and actually have full connectivity." The issue isn't about cost today - nobody is charging extra for IPv6 in addition to IPv4 on a pipe where you already buy IPv4 bandwidth. And Google and HE already buy IPv4 bandwidth. What they are thinking of is the future, 15 years from now, when there is no IPv4 - in that future, IPv6 isn't insignificant bandwidth, it's everything. Wouldn't it be nice to be a tier 1 and not pay for that? Of course! And certainly one can argue for or against the current tier 1 club's exclusivity. But it's the way the internet works right now, for better or worse. In the meantime, in pursuit of this future, today's customers are screwed by these providers trying to position themselves to make more profit margin down the road.
Which is better for the customer? A system where they are screwed today so that their provider can have a better negotiating position in business discussions OR a system where the provider does whatever it takes to provide the customer with full connectivity? (To HE's credit, they are giving away transit today on IPv6, so it's not like you are losing anything of value by not having the full internet routing tables, but it's a huge reason to not pay HE anything in other services, such as data center colocation - go with a provider that you pay and which gives you what you pay for - full transit).
A bit about peering...
Lots of people who aren't running big networks don't understand peering. They think, "Doesn't this benefit everyone if everyone exchanges traffic?" Maybe, on a pure level, but the business doesn't work that way.
I'll give you an example. Let's say you are a little ISP, and located in Virginia, near a major peering point. You say, "All the tier 1s are there, I can pull fiber to that peering point, which is only a block away, and have free internet, other than the cost of the line." So, let's say you run the line, and, let's say that all the tier 1s agree to let you peer for free, since they want your traffic too. Now, let's say your user downloads 1,000 TB from a server in California, on Qwest's network.
You paid, let's say, $15,000 for your piece of fiber going a block. You needed to hire contractors and buy permits and such, after all. So you shared in the costs of letting the user get to the server. What did Qwest pay? Well, they dug trenches, pulled fiber, negotiated with cities, counties, and states, paid taxes on their work, lit this fiber, etc. It cost a lot because they went a lot further than your one block. And a lot more than $15,000.
You say, "So what! Their customer benefits too!" That's true, but let's go a bit further. Let's say you have a network that extends to California - you buy DS3s from Sprint to do it. There's some cost in that, but your user in Virginia would need more bandwidth than your DS3s. So you decide NOT to peer in California, just in Virginia. That way you don't have to upgrade your lines for your Virginia user. Maybe you even legally break your company into two entities, so that you can peer in California and Virginia both, but you can say with a straight face, "We only have Virginia offices for this user - the other company is a separate entity, and not the entity that owns either the server or the end user."
In other words, you found a way to shift most of the traffic burden and infrastructure costs to Qwest, away from your user.
This is why Qwest has some sort of peering policy. Among other things, it will require multiple exchange points, and Qwest will probably say they will send traffic to the closest peering point, to minimize their costs. You get to do the same (more on that later).
Let's say that you currently buy bandwidth from NTT - you're not big enough to get free peering from everyone, but Qwest agrees to peer with you. Of course Qwest and NTT also have a business relationship, to give each other free peering. If Qwest gives you and many other customers free peering, however, you'll send less traffic across NTT's network. That might be good from a technical standpoint, but NTT now is selling you a smaller pipe - and making less money. In effect, Qwest undercut NTT's business and lowered NTT's profits on the connection. How will NTT respond to that, when they were also giving free peering to and from Qwest? Well, they might decide that Qwest isn't a very nice partner and tell Qwest, "Pay us for transit or get lost." That could be ugly - both NTT and Qwest could lose, but Qwest, if they actually care about stable service, won't want to risk it. So generally you don't give peering to anyone who is a customer of one of your free peers. You don't hurt their business. In fact, it's often a requirement in the peering connection, legally. (That said, you could argue whether or not there is an abuse of monopoly here...that's a different issue.)
Going one further, let's say you have the server, and Qwest has the end-user. That doesn't change anything - the economics are still such that Qwest has the cost, you don't. That said, it's convention that the person receiving the traffic pays for most of the backhaul.
Asymmetry in the Internet:
What's the path between your host and a remote server? How do you find it? If you said "traceroute", you might be right, but are probably wrong. You need to traceroute from both sides.
Every provider on the internet is trying to minimize costs. This means that you want traffic to leave your network and go to the destination network with as little distance traveled as possible, because costs go up with distance. It's cheaper to increase the size of pipes within a city to get to a peering point than to increase your backbone pipe size. So, peering contracts typically specify that you dump traffic to the peer as soon as possible. That means the person receiving the traffic generally pays more. It also means that any traffic that crosses an AS boundary almost certainly travels a completely different path each way. In many cases, one third party provider may be used in one direction, another in the other direction. So seeing packet loss on your traceroute at some random Tinet router doesn't mean that this router is the cause of any problem, since the return path for that packet from that provider's router might actually cross yet another network that is never transited in either direction for your network connection. (I'm ignoring that most large providers also don't always send ICMP reliably BECAUSE they limit this intentionally to spare the router CPU from overload - it takes router CPU to generate an ICMP TTL exceeded, but it doesn't take router CPU to forward a packet - so traceroute or ping indicating loss at a router doesn't mean anything in itself - the path itself likely has zero percent loss).
So, here's the scenario.
Let's say a user and a server are on two separate networks, U (user) and S (server).
Let's say they both utilize transit provider T. So the path could be: U -- T -- S. S buys an OC12 from T, while U buys a T1.
But let's say that the user has a second transit provider, BIG, who is a free peer of T. He bought an OC3 from BIG. So there's another path between U and S: U -- BIG -- T -- S. Likely this path is much faster than U -- T -- S.
So, the path for the traffic to S goes U -- BIG -- T -- S.
Now, what path does the traffic from T's router take, when T's router generates an ICMP TTL exceeded in response to a traceroute from the user? Does it go straight over the T1 line, or does it go over the peering connection to BIG and then to the customer? The answer, it turns out, depends on network configuration and policy. Let's say it goes out over the T1, but the T1 is congested. It will look like the congestion is at the connection between BIG and T, because this is the first hop that will show packet loss. BUT...the congestion is actually at U's connection to T, which is irrelevant to the actual traffic path between U and S. So the user, at this point, calls up BIG and T and bitches about "Your peering connection is congested" when the real problem is that traffic completely unrelated to the user's problem is going via a congested path that is never used for connectivity between U and S.
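Here is the same scenario as a toy Python model. The 30% loss figure on the T1 and the reply path chosen for each hop are assumptions made up for illustration (the text only says the T1 is congested and that T's router replies over it). The function names delivery and observed_loss are hypothetical.

```python
def delivery(path, loss):
    """Probability that a packet survives every link along the path."""
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= 1.0 - loss.get(frozenset((a, b)), 0.0)
    return p

def observed_loss(forward, reply_paths, loss):
    """Loss a traceroute would report for each hop on the forward path."""
    report = {}
    for i, hop in enumerate(forward[1:], start=1):
        probe_out = delivery(forward[: i + 1], loss)    # probe reaches the hop
        reply_back = delivery(reply_paths[hop], loss)   # ICMP reply gets home
        report[hop] = 1.0 - probe_out * reply_back
    return report

# Only the U--T link (the T1) is congested; 30% loss is an assumed number.
loss = {frozenset(("U", "T")): 0.30}
forward = ["U", "BIG", "T", "S"]
reply_paths = {
    "BIG": ["BIG", "U"],
    "T": ["T", "U"],              # T's router answers over the congested T1
    "S": ["S", "T", "BIG", "U"],  # assumed: server replies return via BIG
}

for hop, lost in observed_loss(forward, reply_paths, loss).items():
    print(hop, "reported loss:", round(lost * 100), "%")
```

The hop at T reports loss while the hops before and after it report none, which matches the picture described above: the traceroute points at the BIG/T hand-off even though the data path between U and S is clean.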
If you add several providers into this loop, you can end up with a situation where traffic uses Sprint in one direction, but never hits a Sprint router in the other. This is actually very common. A user with slow downloads might be experiencing packet loss on the path from server to user, but not the other way around. In other words, the problem is a provider that never shows up on the user's traceroute!
Remember that the providers hand off the traffic as soon as possible to their peer. So, whoever receives the larger amount of traffic needs the bigger cross-country (or trans-oceanic) links. If one side transmits a T1's worth of data and the other side transmits an OC48's worth of data, only one needs the OC48s across the country - the one receiving the traffic. That's why you hear about "traffic ratios". If the traffic is even both ways, both sides have to pay for the same amount of cross-country infrastructure to carry that traffic. So most providers won't peer with someone for free that sends, say, 10 times the amount of traffic that they will receive. It would end up costing a lot of money.
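A back-of-the-envelope version of the traffic ratio argument, in Python. The line rates are the standard T1 and OC48 figures. The 2:1 limit is a made-up threshold for illustration, not any particular provider's policy, and the function names are hypothetical.

```python
T1_MBPS = 1.544       # standard T1 line rate
OC48_MBPS = 2488.32   # standard OC-48 line rate

def backbone_load(a_sends, b_sends):
    """With early hand-off, each side hauls long distance what it receives."""
    return {"A": b_sends, "B": a_sends}

def within_ratio(a_sends, b_sends, max_ratio):
    """True if the larger flow is at most max_ratio times the smaller one."""
    low, high = sorted((a_sends, b_sends))
    return low > 0 and high / low <= max_ratio

# A sends a T1's worth of traffic, B sends an OC48's worth.
load = backbone_load(T1_MBPS, OC48_MBPS)
print("long-haul Mbps each side must carry:", load)
print("acceptable under an assumed 2:1 policy:",
      within_ratio(T1_MBPS, OC48_MBPS, max_ratio=2.0))
```

A, the side receiving the OC48's worth, is the one that has to build the cross-country capacity, which is why A would refuse to do this for free.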
Back to IPv6...that's interesting, but what does it have to do with IPv6?
Some providers want to do away with traffic ratio policies, multiple location peering, not providing free services to the other's customers, etc.
THAT is why you can't ping some sites from your HE tunnel. It's not just that providers won't peer. It's also that providers have rules to keep themselves from getting screwed.
Certainly, there are ways around some of this (for example, traffic ratios - if I make sure my network is used for the cross-country traffic I send, not yours, then I've addressed that concern at a bit of increased expense for myself). But it's generally not worth doing until the size of the providers is sufficiently large. Other things don't have a good technical fix, like not peering with your peer's customer - that's a business rule.