[105203] in North American Network Operators' Group

Re: DNS problems to RoadRunner - tcp vs udp

daemon@ATHENA.MIT.EDU (Jeroen Massar)
Sat Jun 14 17:55:03 2008

Date: Sat, 14 Jun 2008 23:54:49 +0200
From: Jeroen Massar <jeroen@unfix.org>
To: Scott McGrath <mcgrath@fas.harvard.edu>
In-Reply-To: <485435AE.8060609@fas.harvard.edu>
Cc: nanog@merit.edu
Errors-To: nanog-bounces@nanog.org

Scott McGrath wrote:
>
> There is no call for insults on this list

Insults? Where? If you feel insulted by any of the comments people make
on this list, then you are probably on the wrong list. But that is just
me.

> - Rather thought this list was
> about technical discussions affecting all of us and keeping DNS alive
> for the majority of our customers certainly qualifies.

[..blabber about DNS attacks over TCP..]

If I were a botnet herder and had to take out your site, and I was
going to pick TCP for some magical reason, then I would not care about
your DNS servers; I would just hit your webservers, hard. I mean, just
the 'index.html' (http://www.harvard.edu/) is 24 KB, and that is
excluding pictures. There is bound to be larger data there which you are
going to send and which the bots only have to "ACK" once in a while.

Multiply that by, say, a small botnet of 1M hosts, each just requesting
that 24 KB file. You will have a million flows and no way to rate
limit or control them. Your link will already be full trying to send it
back to the clients, and on top of that your server will probably not be
able to process it in the first place. Simple, effective, and nothing you
can do about it except get way, way more hardware.
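
As a rough sketch of the arithmetic in the paragraph above: the figures
below assume that "24Kb" means 24 kilobytes and that the victim has a
single 10 Gbit/s link (the link speed is an assumption for illustration,
not a number from this mail). Protocol overhead and retransmits are
ignored.

```python
# Back-of-the-envelope numbers for 1M bots each fetching one page.
# Assumptions: 24 KB page (decimal kilobytes), 10 Gbit/s link, no overhead.
PAGE_BYTES = 24 * 1000
BOTS = 1_000_000
LINK_BITS_PER_SEC = 10 * 10**9


def total_response_bits(page_bytes: int, bots: int) -> int:
    """Bits the server must send if every bot fetches the page once."""
    return page_bytes * 8 * bots


def seconds_to_drain(total_bits: int, link_bits_per_sec: int) -> float:
    """Time to push that many bits through a fully saturated link."""
    return total_bits / link_bits_per_sec


total_bits = total_response_bits(PAGE_BYTES, BOTS)
drain_seconds = seconds_to_drain(total_bits, LINK_BITS_PER_SEC)

print(f"payload to send: {total_bits / 8 / 1e9:.0f} GB")
print(f"time on a saturated link: {drain_seconds:.1f} s")
```

Under those assumptions a single round of requests keeps the whole link
busy for many seconds, and the bots can simply ask again.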

If somebody wants to take you out, they will take you out. Just get one
other box with 10GE (not too hard to do) or just get a million of them
with a little bit of connectivity (which is quite easy apparently)...

> We/I am more than aware of the DNS mechanisms and WHY they are there;
> trouble is NO DNS server can handle directed TCP attacks, even the root
> servers crumbled under directed botnet activity and we have taken the
> decision to accept some collateral damage in order to keep services
> available.

"The root servers crumbled"? Wow, I must have missed somebody taking out
all the 13 separate and then individually anycasted root servers. Which,
btw, only do UDP, as currently '.' is still small enough.

$ dig @a.root-servers.net. . NS +tcp
[..]
;; Query time: 95 msec
;; SERVER: 2001:503:ba3e::2:30#53(2001:503:ba3e::2:30)
;; WHEN: Sat Jun 14 23:45:52 2008
;; MSG SIZE  rcvd: 604

That is only 1 packet in and 1 packet out, and the answer is still only
about 600 bytes, while your little webserver would generate 24 KB for
that same sequence.
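
For a sense of how small the request side of that exchange is, here is a
minimal sketch that builds the same kind of query (`. NS`) and frames it
the way DNS over TCP requires, with a two-byte length prefix (RFC 1035
section 4.2.2). The helper names and the query ID are made up for
illustration, and the code does no network I/O; it only constructs bytes.

```python
import struct

TYPE_NS = 2   # RFC 1035 QTYPE for NS
CLASS_IN = 1  # RFC 1035 QCLASS for Internet


def build_query(qname: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Build a bare DNS query message (no EDNS), per RFC 1035 section 4.1."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels ending in a zero byte.
    labels = [label for label in qname.rstrip(".").split(".") if label]
    name = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = name + struct.pack("!HH", qtype, CLASS_IN)
    return header + question


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(message)) + message


query = build_query(".", TYPE_NS)
framed = frame_for_tcp(query)
print(len(query), len(framed))
```

The query for the root is a 12-byte header plus a 5-byte question, so even
with TCP framing the request is a couple of dozen bytes, compared with the
604-byte answer in the dig output above.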

> We are a well connected university network with
> multi-gigabit ingress and egress with 10G on Abilene, so we try to
> protect the internet from attacks originating within our borders AND we
> really feel the full wrath of botnets as we do not have a relatively
> slow WAN link to buffer the effects.

The whole point of botnets is generally just the Denial of Service
(DoS), whether that is because your link is full, because the upstream's
link is full, or because the service can't serve clients anymore.

But clearly, as you are blocking TCP-DNS, you are DoSing yourself
already, so the bot herders win.

Also note that Abilene internally might be 10G and in quite some places
even 40G, but you still have to hand it off to the rest of the world,
and those handoffs will count as the 'slow WAN' links that you think
everybody else on this planet is behind. (Hint: 10GE is kinda the
minimum for most reasonably sized ISPs.)

> Yes - we are blocking TCP; too many problems with drone armies, and we
> started about a year ago when our DNS servers became unresponsive for no
> apparent reason. Investigation showed TCP flows of hundreds of
> megabits/sec and connection table overflows from tens of thousands of
> bots all trying to simultaneously do zone transfers and failing. Tried
> active denial systems and shunning with limited effectiveness.

How is a failed AXFR going to generate a lot of traffic, unless they are
repeating themselves over and over and over again? Thus effectively just
packeting you?

Also, are you talking about Recursive or Authoritative DNS servers here?
Were those bots on your network, or were they remote?

> We are well aware of the host based mechanisms to control zone
> information. Trouble is with TCP if you can open the connection you can
> DoS so we don't allow the connection to be opened and this is enforced
> at the network level where we can drop at wire speed.

Do you mean that the hosts which do TCP are allowed to do transfers or
not? In the latter case they can't generate big answers; they just
get 1 packet back and then a FIN.

Note also that if they are simply trying to overload your hosts, UDP is
much more effective at doing that already, and you apparently have that
hole wide open, otherwise you wouldn't have DNS.

> Open to better
> ideas but if you look at the domain in my email address you will see we
> are a target for hostile activity just so someone can 'make their bones'.

It probably has nothing to do with the domain name; it more likely has
something to do with certain services that are available or provided on
your network.

> Also recall we have a commitment to openness so we would like to make TCP
> services available but until we have effective DNS DoS mitigation which
> can work with 10Gb links it's not going to happen.

You think that 10Gb is a 'fat link', amusing ;)

There are various vendors, most likely also reading this list, who
can be more than helpful in providing you with all kinds of bad, but
also a couple of good, solutions to most networking issues that you are
apparently having.

But the biggest issue you seem to have is not knowing what the DoS
kiddies want to take out and why they want to take it out.

Greets,
  Jeroen

PS: You do know that an "NS" record is not allowed to point to a CNAME, I
hope? (NS3.harvard.edu CNAME ns3.br.harvard.edu. RFC1912 2.4 ;)
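
The rule in the PS can be checked mechanically. The sketch below works
on an in-memory list of records rather than live DNS; the CNAME entry is
the one quoted above, while the NS entry and the function names are
assumptions added purely for illustration.

```python
# Records as (owner, type, target) tuples.
RECORDS = [
    # Assumed delegation, for illustration only; not taken from the mail.
    ("harvard.edu.", "NS", "NS3.harvard.edu."),
    # The CNAME quoted in the PS.
    ("NS3.harvard.edu.", "CNAME", "ns3.br.harvard.edu."),
]


def normalize(name: str) -> str:
    """Lower-case a name and make sure it ends in exactly one dot."""
    return name.lower().rstrip(".") + "."


def ns_targets_that_are_cnames(records):
    """Return (zone, nameserver) pairs whose NS target owns a CNAME."""
    cname_owners = {normalize(o) for o, rtype, _ in records if rtype == "CNAME"}
    return [
        (normalize(owner), normalize(target))
        for owner, rtype, target in records
        if rtype == "NS" and normalize(target) in cname_owners
    ]


violations = ns_targets_that_are_cnames(RECORDS)
for zone, nameserver in violations:
    print(f"{zone} NS {nameserver} points at a CNAME (see RFC 1912 2.4)")
```

Names are compared case-insensitively, since DNS names are not case
sensitive and the PS itself mixes "NS3" and "ns3".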
