[21324] in Athena Bugs

home help back first fref pref prev next nref lref last post

sun 9.1.18: bind lossage (failure to note A records w/ CNAMES)

daemon@ATHENA.MIT.EDU (John Hawkinson)
Wed Jan 15 23:19:44 2003

Date: Wed, 15 Jan 2003 23:19:41 -0500 (EST)
Message-Id: <200301160419.XAA05396@bart-savagewood.mit.edu>
To: bugs@MIT.EDU
cc: Angie Kelic <sly@MIT.EDU>
From: John Hawkinson <jhawk@MIT.EDU>


Angie noticed persistent inconsistent DNS problems attempting to reach
the WorldCat catalog, at"http://newfirstsearch.oclc.org/html/
webscript.html:%3Asessionid=sp03sw01-39251-db04uhof-p7rk0". I
investigated and found some problems with our bind implementation.

1. Hard to debug.

    First of all, it's hard to debug. In order to turn on debugging in
bind 9 (at least, in a useful ad hoc fashion without restarting the
server), it's necessary to use "rndc", which replaces ndc. Unfortunately,
rndc requires the use of strong encryption that must be configured, even if,
in the default installation, it only works from 127.0.0.1.

It would be nice if rndc was setup by default on athena. It should be
sufficient to run "/usr/athena/etc/rndc-confgen -a -r /path/to/entropy"
assuming there's a reasonably plausible source of randomness. Since it's
only relevent for connections from localhost, I don't think much entropy
is needed.

2. bind's internal debugging sucks.

    bind's and query logging is just about useless. If an upstream
bugreport gets passed up, perhaps this should be harped upon. In
particular, there is very little information available about responses
received to queries made by named.

3. The real problem: ignoring A records returned with CNAME records.

I started out by restarting named (to clear the cache), and then "rndc
querylog" and "rndc trace 9". Even at level 9, named debugging is of
limitted usefulness (it looks like 3 is about the actual max that's used).

I did a "dig newfirstsearch.oclc.org a" to simulate the query, and of course
it failed. Logs from named.run (from querylogging/tracing) are of limited
utility. e.g.:

Jan 15 22:07:50.748 createfetch: newfirstsearch.oclc.org A
Jan 15 22:07:50.748 fctx 2029d8: create
Jan 15 22:07:50.749 fctx 2029d8: join
Jan 15 22:07:50.749 fetch 1d9ec8 (fctx 2029d8): created
Jan 15 22:07:50.749 client @1e73c0: udprecv
Jan 15 22:07:50.749 fctx 2029d8: start
Jan 15 22:07:50.749 fctx 2029d8: try
Jan 15 22:07:50.750 fctx 2029d8: cancelqueries
Jan 15 22:07:50.750 fctx 2029d8: getaddresses
Jan 15 22:07:50.750 res 1fb5a0: dns_resolver_prime
Jan 15 22:07:50.750 res 1fb5a0: priming
Jan 15 22:07:50.751 createfetch: . NS
Jan 15 22:07:50.751 fctx 21ed58: create
Jan 15 22:07:50.751 fctx 21ed58: join
Jan 15 22:07:50.751 fetch 1d9e98 (fctx 21ed58): created
...
Jan 15 22:08:03.814 received control channel command 'dumpdb'
...
Jan 15 22:09:47.147 fctx 22b258: too many timeouts, disabling EDNS0
...

and soforth. They don't make it clear what queryies were made to what
servers and what the responses were. Instead, I turned to tcpdump.

with "tcpdump -s1500 -w/var/tmp/ns port 53", and then reading with
"tcpdump -nvvvr /var/tmp/ns". I won't include the whole thing here,
but a number of anomalous behaviors are present. I'm going to reorder
these a little bit so that replies come right with queries.


22:07:50.758436 18.187.1.77.33517 > 128.9.0.107.53: [udp sum ok] 42985
 [1au] A? newfirstsearch.oclc.org. ar: . OPT U DPsize=2048 (52) (DF)
22:07:50.843396 128.9.0.107.53 > 18.187.1.77.33517: [udp sum ok] 42985
 FormErr-% [0q] 0/0/0 (12) (DF) (ttl 52, id 0, len 40)

We ask B.ROOT-SERVERS.NET about newfirstsearch and it tells us "go
screw."  Apparently this is because of EDNS0 (RFC2671 extensions) that
it does not support. Eventually bind gives up and stops trying (see
above 22:09:47 log message). At first I thought this was cause for
concern, but upon further reflection I'm not too worried.

22:07:50.844851 18.187.1.77.33517 > 128.9.0.107.53: [udp sum ok] 38695
  A? newfirstsearch.oclc.org. (41) (DF) (ttl 25 5, id 10197, len 69)
22:07:50.930251 128.9.0.107.53 > 18.187.1.77.33517: [udp sum ok]
  38695- q: A? newfirstsearch.oclc.org. 0/9/9 ns: org . NS
  A7.NSTLD.COM., org. NS L7.NSTLD.COM., org. NS G7.NSTLD.COM., org. NS
  F7.NSTLD.COM., org. NS M5.NSTLD.COM., org.  NS J5.NSTLD.COM., org. NS
  I5.NSTLD.COM., org. NS C5.NSTLD.COM., org. NS E5.NSTLD.COM. ar:
  A7.NSTLD.COM. A 192.5.6.36, L7.NSTLD.COM. A 192.41.162.36,
  G7.NSTLD.COM. A 192.42.93.36, F7.NSTLD.COM. A 192.35.51.36,
  M5.NSTLD.COM. A 192.55.83 .34, J5.NSTLD.COM. A 192.48.79.34,
  I5.NSTLD.COM. A 192.43.172.34, C5.NSTLD.COM. A 192.26.92.34,
  E5.NSTLD.COM. A 192.1 2.94.34 (347) (DF) (ttl 52, id 0, len 375)

Here we ask and are correctly referred to the nameservers for ORG. Moving
right along, we ask e5.NSTLD.COM, and do the same "go screw; well, OK"
dance:

22:07:50.938583 18.187.1.77.33517 > 192.12.94.34.53: [udp sum ok]
  31039 [1au] A? newfirstsearch.oclc.org. ar: . OPT UDPsize=2048 (52)
  (DF) (ttl 255, id 1205, len 80)
22:07:51.021715 192.12.94.34.53 > 18.187.1.77.33517: [udp sum ok]
  31039 FormErr-% [0q] 0/0/0 (12) (DF) (ttl 43, id 0 , len 40)
22:07:51.023182 18.187.1.77.33517 > 192.12.94.34.53: [udp sum ok] 7319
  A? newfirstsearch.oclc.org. (41) (DF) (ttl 25 5, id 1206, len 69)
22:07:51.108290 192.12.94.34.53 > 18.187.1.77.33517: [udp sum ok]
  7319- q: A? newfirstsearch.oclc.org. 0/2/2 ns: ocl c.org. NS
  DNS.oclc.org., oclc.org. NS DNS2.oclc.org. ar: DNS.oclc.org. A
  132.174.11.252, DNS2.oclc.org. A 132.174.11.  253 (110) (DF) (ttl 43,
  id 0, len 138)

So far so good. Then, we ask DNS.OCLC.ORG, and we start to have
problems:

22:07:51.113478 18.187.1.77.33517 > 132.174.11.252.53: [udp sum ok]
  44903 [1au] A? newfirstsearch.oclc.org. ar: . OP T UDPsize=2048 (52)
  (DF) (ttl 255, id 36403, len 80)
22:07:51.166982 132.174.11.252.53 > 18.187.1.77.33517: [udp sum ok]
  44903*- q: A? newfirstsearch.oclc.org. 2/2/3 new
  firstsearch.oclc.org. CNAME firstsearch.oclc.org.,
  firstsearch.oclc.org. A 132.174.11.11 ns: oclc.org. NS dns2.oclc.o
  rg., oclc.org. NS dns.oclc.org. ar: dns.oclc.org. A 132.174.11.252,
  dns2.oclc.org. A 132.174.11.253, . OPT UDPsize=40 96 (163) (DF) (ttl
  241, id 24566, len 191)

It _looks_ good here; we are given back both newfirstsearch.oclc.org
(a CNAME) as well as firstsearch.oclc.org (the A record). But somehow
it seems that named gets the CNAME but not the A record. I don't
understand why. named then goes ballistic asking the two oclc servers
about firstsearch.oclc.org:

22:07:51.173302 18.187.1.77.33517 > 132.174.11.253.53: [udp sum ok]
  32703 [1au] A? firstsearch.oclc.org. ar: . OPT U DPsize=2048 (49) (DF)
  (ttl 255, id 1655, len 77)
22:07:53.176569 18.187.1.77.33517 > 132.174.11.252.53: [udp sum ok]
  46491 [1au] A? firstsearch.oclc.org. ar: . OPT U DPsize=2048 (49) (DF)
  (ttl 255, id 36404, len 77)
22:07:55.187915 18.187.1.77.33517 > 132.174.11.253.53: [udp sum ok]
  22555 [1au] A? firstsearch.oclc.org. ar: . OPT U DPsize=2048 (49) (DF)
  (ttl 255, id 1656, len 77)
22:07:57.196857 18.187.1.77.33517 > 132.174.11.252.53: [udp sum ok]
  36052 A? firstsearch.oclc.org. (38) (DF) (ttl 25 5, id 36405, len 66)
22:07:59.208168 18.187.1.77.33517 > 132.174.11.253.53: [udp sum ok]
  50794 A? firstsearch.oclc.org. (38) (DF) (ttl 25 5, id 1657, len 66)

and to no avail. I'm somewhat puzzled why they don't answer, actually.
Presumably they're ratelimitting us, or they don't like the EDNS0
queries but are choosing not to say so. We don't hear from either of
them for a while, actually. We try again 8 seconds later and get no
answer (so much for rate limitting):

22:08:07.216865 18.187.1.77.33517 > 132.174.11.252.53: [udp sum ok]
  35186 A? firstsearch.oclc.org. (38) (DF) (ttl 25 5, id 36406, len 66)
22:08:15.228351 18.187.1.77.33517 > 132.174.11.253.53: [udp sum ok]
  64617 A? firstsearch.oclc.org. (38) (DF) (ttl 25 5, id 1658, len 66)


I'll come back to the tcpdump traces in a moment. But at this point (actually
22:08:03), I dumped the db. Here's what we had for oclc.org:
; glue
oclc.org.               172788  NS      DNS.oclc.org.
                        172788  NS      DNS2.oclc.org.
; glue
DNS.oclc.org.           172788  A       132.174.11.252
; glue
DNS2.oclc.org.          172788  A       132.174.11.253
; authanswer
newfirstsearch.oclc.org. 43188  CNAME   firstsearch.oclc.org.


Note the lack of firstsearch.oclc.org's A record (unsurprising since
named keeps trying to ask for it). This is what I presume is the
bug. Looking a bit more closely at what tcpdump said:

22:07:51.166982 132.174.11.252.53 > 18.187.1.77.33517: [udp sum ok]
  44903*- q: A? newfirstsearch.oclc.org. 2/2/3 new
  firstsearch.oclc.org. CNAME firstsearch.oclc.org.,
  firstsearch.oclc.org. A 132.174.11.11 ns: oclc.org. NS dns2.oclc.o
  rg., oclc.org. NS dns.oclc.org. ar: dns.oclc.org. A 132.174.11.252,
  dns2.oclc.org. A 132.174.11.253, . OPT UDPsize=40 96 (163) (DF) (ttl
  241, id 24566, len 191)

2/2/3 means 2 answers, 2 nameservers (authority), and 3 additional records. 
This looks just like what we get if we replicate the query with dig:

; <<>> DiG 9.2.1 <<>> +norec @132.174.11.252 newfirstsearch.oclc.org a
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 543
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;newfirstsearch.oclc.org.       IN      A

;; ANSWER SECTION:
newfirstsearch.oclc.org. 43200  IN      CNAME   firstsearch.oclc.org.
firstsearch.oclc.org.   43200   IN      A       132.174.11.11

;; AUTHORITY SECTION:
oclc.org.               14400   IN      NS      dns.oclc.org.
oclc.org.               14400   IN      NS      dns2.oclc.org.

;; ADDITIONAL SECTION:
dns.oclc.org.           14400   IN      A       132.174.11.252
dns2.oclc.org.          14400   IN      A       132.174.11.253

;; Query time: 53 msec
;; SERVER: 132.174.11.252#53(132.174.11.252)
;; WHEN: Wed Jan 15 22:45:58 2003
;; MSG SIZE  rcvd: 152

If I go back and look at the named tracing and correlate by
the millisecond, it's still not helpful, but vaguely hinting:

Jan 15 22:07:51.113 resquery 2217a0 (fctx 2029d8): send
Jan 15 22:07:51.113 resquery 2217a0 (fctx 2029d8): sent
Jan 15 22:07:51.113 resquery 2217a0 (fctx 2029d8): senddone

That's the query (44903).

Jan 15 22:07:51.167 resquery 2217a0 (fctx 2029d8): response
Jan 15 22:07:51.167 fctx 2029d8: answer_response
Jan 15 22:07:51.167 fctx 2029d8: cache_message
Jan 15 22:07:51.168 fctx 2029d8: cancelquery
Jan 15 22:07:51.168 fctx 2029d8: done
Jan 15 22:07:51.168 fctx 2029d8: stopeverything
Jan 15 22:07:51.168 fctx 2029d8: cancelqueries
Jan 15 22:07:51.168 dns_adb_destroyfind on find 21ddf8
Jan 15 22:07:51.169 dns_adb_destroyfind on find 2036b0
Jan 15 22:07:51.169 fctx 2029d8: sendevents
Jan 15 22:07:51.169 fetch 1d9ec8 (fctx 2029d8): destroyfetch
Jan 15 22:07:51.169 fctx 2029d8: shutdown
Jan 15 22:07:51.170 createfetch: firstsearch.oclc.org A
Jan 15 22:07:51.170 fctx 203288: create
Jan 15 22:07:51.170 fctx 203288: join
Jan 15 22:07:51.170 fetch 1d9ec8 (fctx 203288): created
Jan 15 22:07:51.170 fctx 2029d8: doshutdown
Jan 15 22:07:51.171 fctx 2029d8: stopeverything
Jan 15 22:07:51.171 fctx 2029d8: cancelqueries
Jan 15 22:07:51.171 fctx 2029d8: destroy

And that's all the response. It is worth noting, I guess that it
says "createfetch: firstsearch.oclc.org A" which implies it is
doing something with that data. But of fourse, it doens't seem to be.

"bleh."

Eventually (minutes later), it starts to work. Presumably because it
has started to ask for firstsearch.oclc.org explicitly, and it finally
gives up trying to do EDNS0 and so the query is actually answered.

What to do? "Back to bind 8!". (?)
Another choice would be to simply disable EDNS0. It seems one can use:

        server {
                edns no;
        };

to do the deed. Perhaps we could turn it on when the world
doesn't suck so much?



Hmm. Another complication is that this problem is not 100%
repeatable. That is, sometimes, just after restarting named, it
successfully gets a reply to the explicit query for
firstearch.oclc.org. This seems to tear a whole in the theory that
dns.oclc.org doesn't like EDNS. For instance, a "host firsearch.oclc.org"
right after restarting bind ("tcpdump -vvs1500 net 132.174.11 and port 53"):

23:12:23.205751 bart-savagewood.mit.edu.33850 > dns.oclc.org.domain:
  18332 [b2&3=0x10] [1au] A? newfirstsearch.oclc.org. (52) (DF) (ttl
  255, id 58051)
23:12:23.265132 dns.oclc.org.domain > bart-savagewood.mit.edu.33850:
  18332*- q: newfirstsearch.oclc.org. 2/2/3
  newfirstsearch.oclc.org. CNAME firstsearch.oclc.org.,
  firstsearch.oclc.org. A firstsearch.oclc.org (163) (DF) (ttl 241, id
  15093)
23:12:23.266470 bart-savagewood.mit.edu.33850 > dns2.oclc.org.domain:
  20930 [b2&3=0x10] [1au] A? firstsearch.oclc.org. (49) (DF) (ttl 255,
  id 44404)
23:12:25.271551 bart-savagewood.mit.edu.33850 > dns.oclc.org.domain:
  21556 [b2&3=0x10] [1au] A? firstsearch.oclc.org. (49) (DF) (ttl 255,
  id 58052)
23:12:25.317455 dns.oclc.org.domain > bart-savagewood.mit.edu.33850:
  21556*- q: firstsearch.oclc.org. 1/2/3 firstsearch.oclc.org. A
  firstsearch.oclc.org (134) (DF) (ttl 241, id 15094)

So now I'm even more at a loss. "bleh."
At least the named bug is consistent.

--jhawk

home help back first fref pref prev next nref lref last post