[4961] in testers

erratic host lookup failures

daemon@ATHENA.MIT.EDU (Nickolai Zeldovich)
Tue Jul 10 01:34:02 2001

Message-Id: <200107100533.BAA25852@pepsi-one.mit.edu>
To: testers@MIT.EDU
Date: Tue, 10 Jul 2001 01:33:54 -0400
From: Nickolai Zeldovich <kolya@MIT.EDU>

[ Apologies up front for the rather vague bug report, but I've been
  unable to gain much insight into this bug for a few weeks now, and
  figured I'd send a bug report of sorts in case anyone else knows
  what might be going on. ]

I've been seeing a lot (maybe 5-10 a day) of seemingly random host
lookup failures for a while now on my Ultra 1 running 9.0.  While I
suppose this could be the result of transient network failures, it
started appearing once the machine was upgraded to 9.0, and I hadn't
noticed any systematic failures of this kind in 8.4.

Most often I see this bug in Netscape, where trying to access
a page on a certain host for the first time causes a pause, then
a message:

   Netscape is unable to locate the server <hostname>.

after which repeated attempts to load the same URL fail immediately
for a few seconds.  The next attempt then pauses and resolves
correctly.  Subsequent accesses to pages on the same host resolve
immediately.

I've also seen this failure in lynx, telnet, and finger, all with
the same symptoms (pause, failure, then a period of immediate
failures, and then another pause and correct resolution).  In all
cases the lookup fails as though NXDOMAIN had been returned.
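
One way to capture the pattern above (pause, failure, quick failures,
pause, success) outside of any particular application is to resolve
the same name repeatedly and record how long each attempt takes.  The
following is only a minimal sketch, assuming Python 3 is available;
probe() and its parameters are hypothetical names made up for this
illustration, and it goes through the system resolver rather than
querying named directly.

```python
import socket
import time


def probe(host, attempts=5, delay=1.0,
          resolver=socket.gethostbyname,
          clock=time.monotonic, sleep=time.sleep):
    """Resolve `host` several times in a row.

    Returns a list of (elapsed_seconds, outcome) tuples, one per
    attempt.  `outcome` is the resolved address on success, or the
    exception object if the lookup failed.

    `resolver`, `clock`, and `sleep` are injectable (an assumption of
    this sketch) so the loop can be exercised without a network.
    """
    results = []
    for i in range(attempts):
        start = clock()
        try:
            outcome = resolver(host)
        except OSError as exc:
            # socket.gaierror and socket.herror are OSError subclasses,
            # so resolver failures land here.
            outcome = exc
        results.append((clock() - start, outcome))
        if i < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return results
```

Calling probe("www.cnet.com") and printing each tuple would show
whether a slow failed attempt is followed by fast failures and then a
slow success, which is the sequence described above.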

FWIW, the bug predates the listen-on { 127.0.0.1; } change to
named.conf.

I haven't been able to find any pattern to these failures, but hosts
that I remember having problems with include www.cnet.com,
route-views.oregon-ix.net, and www.inscoe.org.

-- kolya