[89087] in North American Network Operators' Group
Re: shim6 @ NANOG (forwarded note from John Payne)
daemon@ATHENA.MIT.EDU (Kevin Day)
Thu Mar 2 08:32:16 2006
In-Reply-To: <OF343AD7D4.D4FC041D-ON80257125.0036634C-80257125.00379E66@btradianz.com>
Cc: nanog@nanog.org
From: Kevin Day <toasty@dragondata.com>
Date: Thu, 2 Mar 2006 07:31:40 -0600
To: Michael.Dillon@btradianz.com
Errors-To: owner-nanog@merit.edu
On Mar 2, 2006, at 4:07 AM, Michael.Dillon@btradianz.com wrote:
>> ome.
>
> When I see comments like this I wonder whether people
> understand what shim6 is all about. First of all, these
> aren't YOUR hosts. They belong to somebody else. If you
> are an access provider then these hosts belong to a customer
> that is paying you to carry packets. This customer also
> pays another ISP for the same service and the hosts
> are making decisions about whether to use your service
> or your competitors.
>
> If you are a hosting provider, then these hosts, owned
> by a third party, are making decisions about whether to
> send you packets through one or another AS.
>
> Is there something inherently wrong with independent
> organizations deciding where to send their packets?
The problem is when the *hosting company* or *ISP* is multihomed and
using shim6. The customers aren't straddling two hosting companies;
they're using a hosting company that is using shim6.
Take us as a slightly exaggerated example (using totally made-up
bandwidth and prices, to protect NDAs). We have several boxes on our
network that we do not control; we don't even have a login on those
servers.
In one POP we have three transit providers. NSP A gives us 10 Gbps of
bandwidth and charges us $50/Mbps. NSP B is on a GigE, but we only
have a 500 Mbps commit. B charges us $75/Mbps, but $150/Mbps if we go
over our commit. NSP C is also on a GigE, but we only have a 100 Mbps
commit; C charges us $200/Mbps, and $500/Mbps if we go over our commit.
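As a rough illustration of the pricing above, here is a minimal sketch of what each provider's monthly bill looks like. It assumes that the commit is always billed in full at the base rate and that only the Mbps above the commit are billed at the overage rate; the post does not say how the overage is applied, so that is a guess. NSP A has no stated commit, so it is modelled as pure usage. The names `Transit` and `monthly_cost` are hypothetical, and the figures are the made-up ones from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transit:
    name: str
    commit_mbps: float    # committed rate, billed whether it is used or not
    rate: float           # $/Mbps up to the commit
    overage_rate: float   # $/Mbps above the commit (assumed: excess only)


def monthly_cost(link: Transit, billed_mbps: float) -> float:
    """Full commit at the base rate, plus any excess at the overage rate."""
    excess = max(0.0, billed_mbps - link.commit_mbps)
    return link.commit_mbps * link.rate + excess * link.overage_rate


# Made-up figures from the example above.
A = Transit("A", 0, 50, 50)      # no commit stated, so usage-only
B = Transit("B", 500, 75, 150)
C = Transit("C", 100, 200, 500)

if __name__ == "__main__":
    for link, mbps in [(A, 3000), (B, 500), (B, 600), (C, 100), (C, 200)]:
        print(f"NSP {link.name} at {mbps} Mbps: ${monthly_cost(link, mbps):,.0f}")
```

Under these assumptions, 100 Mbps of extra traffic costs $5,000 on A, $15,000 on B once B is past its commit, and $50,000 on C once C is past its commit.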
I don't want a customer to touch NSP C, except for a very tiny number
of routes where A and B aren't so great. I want to run NSP B as close
to our commit as possible without going over it. I want everything
else to go over NSP A. If any of the three transit connections goes
down, all the rules change temporarily (but hopefully not for long
enough that we get dinged on 95th-percentile billing).
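The policy in the previous paragraph can be written down as a small decision function. This is only a sketch under assumptions: the function name `pick_provider`, the 95% headroom on B, the fallback order when A is down, and the `2001:db8::` documentation prefixes are all hypothetical, and a real policy lives in BGP configuration rather than in Python.

```python
def pick_provider(prefix, usage_mbps, up, c_only_prefixes,
                  b_commit=500.0, b_headroom=0.95):
    """Return 'A', 'B' or 'C' for new traffic toward `prefix`.

    usage_mbps: current outbound Mbps per provider, e.g. {"B": 430.0}
    up:         link state per provider, e.g. {"A": True, "B": True, "C": True}
    """
    # A tiny set of routes where A and B perform poorly is pinned to C.
    if prefix in c_only_prefixes and up.get("C", False):
        return "C"
    # Fill B up to just under its commit (assumed 95% target).
    if up.get("B", False) and usage_mbps.get("B", 0.0) < b_commit * b_headroom:
        return "B"
    # Everything else goes over A.
    if up.get("A", False):
        return "A"
    # A is down: the rules change. Assumed order: B (over commit), then C.
    for name in ("B", "C"):
        if up.get(name, False):
            return name
    raise RuntimeError("no transit provider available")


if __name__ == "__main__":
    pinned = {"2001:db8:1::/48"}
    all_up = {"A": True, "B": True, "C": True}
    print(pick_provider("2001:db8:1::/48", {"B": 100.0}, all_up, pinned))
    print(pick_provider("2001:db8:2::/48", {"B": 100.0}, all_up, pinned))
    print(pick_provider("2001:db8:2::/48", {"B": 480.0}, all_up, pinned))
```

Even in this toy form the function needs inputs (current usage per provider, link state, the pinned-route list) that an end host has no way to observe on its own.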
Putting the routing decisions in the hands of the servers (that we do
not control) requires that we somehow impart this routing policy to
our customers, make them keep it up to date when we change things,
and somehow enforce that they don't break the policy. If a customer
sees that forcing traffic through NSP C results in a faster
connection for them, they may tweak/break the selection process of
shim6 (or just ignore our policy instructions) and cost us lots of
money. We may learn from one of our providers that they lost an OC48
in our city and can't handle our full traffic, so we need to back off
immediately. Or we may know in advance that a connection is about to
go down, and want to preemptively route around it so that traffic
doesn't get blackholed before the routers notice.
On very high traffic days, we may make 10+ manual changes to our BGP
policies to balance outbound and inbound traffic, to keep traffic
levels under our commits while still utilizing as much of each commit
as possible. We have automated tools that make slight tweaks every 5
minutes. How can information that changes this frequently, and
involves a very large dataset (several full tables of routes), get
propagated to hundreds/thousands of hosts in a reasonable timeframe?
Are we reinventing BGP as an IGP to send route data to shim6? :) And
do we want to blow that much RAM keeping a full routing table on each
server? Even compressed to only list exceptions to a default route,
my list of exceptions is still huge.
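For readers unfamiliar with why staying under a commit is a moving target, here is a minimal sketch of 95th-percentile billing as it is commonly computed: sample the link at a fixed interval (often every 5 minutes), sort the month's samples, discard the top 5%, and bill on the highest remaining sample. The exact method varies by provider, so treat this as an assumption rather than any particular contract's formula; `percentile_95` is a hypothetical name.

```python
def percentile_95(samples_mbps):
    """Sort the samples, drop the top 5%, return the highest one left."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # Number of samples kept: ceil(0.95 * n), done in integer arithmetic.
    keep = (95 * n + 99) // 100
    return ordered[keep - 1]


if __name__ == "__main__":
    # 100 samples against a 500 Mbps commit (all figures made up).
    short_burst = [400] * 95 + [900] * 5   # over commit for 5% of samples
    long_burst = [400] * 94 + [900] * 6    # over commit for 6% of samples
    print(percentile_95(short_burst))
    print(percentile_95(long_burst))
```

In this sketch a burst covering 5% of the samples is discarded entirely, while one more sample over the line makes the whole month bill at the burst rate. That is why a few hosts pushing traffic to the wrong provider for a little too long can be expensive.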
The same problems exist, on a smaller scale, on enterprise networks.
Routing policies can be complex and depend on information that isn't
currently visible to end hosts and that changes frequently, and
ignoring the policy can be very costly. Under current BGP-style
decisions-at-the-edges networking, it's impossible for an end user or
server to ignore routing policy. With shim6, the end nodes ARE the
routing policy. There's a lot in many networks' decision-making
process for "how to select the best route" that can't be measured with
RTT or received TTLs, or anything else the end nodes can see.
Even outside the case of enterprise/hosting environments, transit
providers already send route preference data to their customers. As a
transit provider I'm able to depref/prepend/tag/etc. routes to
customers that we'd rather they not use (though they are free to
ignore that). Under shim6, it's not really possible for your upstreams
to tell you "My connection to this network is degraded at the moment,
use it only as a last resort", whereas with BGP they can prepend those
routes a dozen times or flag them with a community and you won't use
them unless you have to. Under host-based routing, all end nodes have
to be made aware of this information.
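To show why prepending works as a "last resort" signal, here is a toy model of the first two steps of BGP best-path selection: highest local preference wins, then the shortest AS path. Real BGP has more tie-breakers and this is not any router's implementation. `Route`, `prepend` and `best_route` are hypothetical names, and the AS numbers come from the range reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str             # which upstream announced the route
    as_path: tuple       # AS numbers, nearest first
    local_pref: int = 100


def prepend(route: Route, asn: int, times: int) -> Route:
    """What an upstream does to make its own path look longer."""
    return Route(route.via, (asn,) * times + route.as_path, route.local_pref)


def best_route(routes):
    """Highest local-pref first; ties go to the shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


if __name__ == "__main__":
    via_x = Route("X", (64500, 64510))
    via_y = Route("Y", (64501, 64505, 64510))
    print(best_route([via_x, via_y]).via)        # X has the shorter path

    degraded_x = prepend(via_x, 64500, 12)       # X prepends a dozen times
    print(best_route([degraded_x, via_y]).via)   # Y is now preferred
    print(best_route([degraded_x]).via)          # X is still usable if Y is gone
```

The signal travels inside the routing protocol and is acted on at the border routers. With host-based selection there is no equivalent channel unless every end node is given the same information.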
Something like shim6 works great for small or medium businesses that
don't care about this sort of thing, whose routing policies only
change when they add/drop a provider, and that don't have thousands
of customers with root access on their boxes trying to game the
system. I just don't think it's a solution for everyone.