[795] in linux-net channel archive
"connection refused" on recvfrom!?
daemon@ATHENA.MIT.EDU (Avery Pennarun)
Sat Jul 29 15:48:26 1995
Date: Thu, 27 Jul 1995 22:36:08 -0400 (EDT)
From: Avery Pennarun <apenwarr@foxnet.net>
To: linux-net@vger.rutgers.edu
Is this normal? Tested under Linux 1.3.10 and 1.2.11.
Quick summary (a program that actually does this is included at the end of this message):
- use socket() to create an unconnected UDP socket
- sendto() some IP address, first to the ECHO port, then to a
nonexistent port (i.e. one where the connection would be refused). Both
sendto() calls return successfully. Okay, I can understand that;
UDP is supposed to be an unreliable protocol.
- delay a bit to let the internet do its thing
- recvfrom() on the socket returns an error; errno is ECONNREFUSED (!)
and there's nothing valid in the "from" address (i.e.
we don't know who refused the connection!).
- a second recvfrom returns the ECHO packet, as expected.
It doesn't matter what target IP address you use, provided it doesn't point
to the local host.
On a SunOS 4.1.3_U1 system I tried, the sendto() to a nonexistent port returned
ECONNREFUSED right away; does this mean it delayed until it got a response
back? The result code makes perfect sense and is understandable, but how
does it manage this with UDP, which isn't guaranteed to respond with
anything at all?
Under Linux, if the destination address is localhost or any other address
for the local system, behaviour is the same as SunOS; sendto() gives
ECONNREFUSED.
The problem with this behaviour is that the ECONNREFUSED takes priority over
any other received datagram; that means that (as in the example above) the
Connection Refused shows up _before_ the echo packet, regardless of the
order they were sent out. (unless, obviously, the echo packet is read
before the connection refused comes back)
This means it's absolutely impossible to determine _which_ connection was
refused, which makes the error message pretty useless. Even if the messages
did come back in the right order, given the unreliability of UDP datagrams,
making an educated guess would still be risky.
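One way an application can sidestep the ambiguity, sketched here under
assumptions that are not part of the report itself: give each destination
its own connect()ed UDP socket, so that an asynchronous error reported on a
descriptor can only refer to that one peer. The helper name
open_connected_udp is hypothetical, and whether a given kernel reports
ECONNREFUSED on a connected UDP socket should be checked rather than
assumed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): returns a UDP socket
 * connected to ip:port, or -1 on failure. */
int open_connected_udp(const char *ip, unsigned short port)
{
    struct sockaddr_in peer;
    int fd;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1)
        return -1;              /* not a dotted-quad IPv4 address */

    fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0)
        return -1;

    /* connect() on a UDP socket sends nothing; it only records the peer,
     * so later errors on this descriptor are tied to that peer. */
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With one such socket per destination (for example one for the ECHO port
and one for port 9999), an ECONNREFUSED seen on a given descriptor
identifies which destination refused.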
Possible fixes:
- ideally, make sendto() return ECONNREFUSED like SunOS does. I
don't really know how they do this, though, and if it
involves a delay we're probably better off not doing it.
- otherwise, at least never return ECONNREFUSED on a recvfrom. It
wasn't the recvfrom that caused the error, after all.
This sounds like a kernel bug, unless I'm missing something important. It
greatly aggravates "ytalk" because it tries to communicate with both brands
of talk daemon and doesn't expect recvfrom to return an error unless
something REALLY bad happens (it's quite willing to accept sendto()
ECONNREFUSED, however):
- open a socket
- sendto() ports 518 and 517, in that order.
- select() on the socket (if it times out, try again)
- recvfrom() on the socket; on a system like SunOS, which only has a
talk daemon on port 517, the first recvfrom() returns an error
and ytalk aborts.
I include a patch below to make ytalk try the select()/recvfrom() pair again
if it gets ECONNREFUSED. I don't think it's the right solution, though
(particularly since I used a goto... yecch :))
There's more after the patch, so don't stop now :)
---------------->zing>------------------------
diff --unified ytalk-3.0pl2/socket.c ytalk-3.0pl2+ave/socket.c
--- ytalk-3.0pl2/socket.c Wed Nov 24 06:53:17 1993
+++ ytalk-3.0pl2+ave/socket.c Thu Jul 27 21:27:42 1995
@@ -428,6 +428,7 @@
if(n != sizeof(m1))
show_error("Warning: cannot write to old talk daemon");
+select_again:
tv.tv_sec = 4L;
tv.tv_usec = 0L;
sel = (1 << talkd[ntalk].fd) | (1 << talkd[otalk].fd);
@@ -444,9 +445,21 @@
for(d = 1; d <= daemons; d++)
if(sel & (1 << talkd[d].fd))
{
- out |= (1 << d);
- if(recv(talkd[d].fd, errstr, talkd[d].rlen, 0) < 0)
+ int recl;
+
+ recl=recv(talkd[d].fd, errstr, talkd[d].rlen, 0);
+ if (recl<0 && errno==ECONNREFUSED)
+ {
+ /* at least one of the daemons doesn't exist at all;
+ * but we don't know which one. We need to try to
+ * select() again.
+ */
+ goto select_again;
+ }
+ else if (recl<0)
show_error("find_daemon: recv() failed");
+
+ out |= (1 << d);
}
tv.tv_sec = 0L;
---------------->zing>------------------------
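The retry in the patch above can also be written as a bounded loop rather
than a goto. What follows is only a sketch for a single descriptor, not
ytalk code: the function name recv_skip_refused and the max_tries limit are
assumptions made for illustration, and ytalk itself selects across two
daemon sockets.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_sec for a datagram on fd.
 * A recv() that fails with ECONNREFUSED is treated as "try again",
 * at most max_tries times. Returns the byte count, or -1 with errno
 * set (ETIMEDOUT if select() timed out). */
ssize_t recv_skip_refused(int fd, void *buf, size_t len,
                          long timeout_sec, int max_tries)
{
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        fd_set readable;
        struct timeval tv;
        int ready;
        ssize_t n;

        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        ready = select(fd + 1, &readable, NULL, NULL, &tv);
        if (ready < 0)
            return -1;          /* select() itself failed */
        if (ready == 0) {
            errno = ETIMEDOUT;  /* nothing arrived in time */
            return -1;
        }

        n = recv(fd, buf, len, 0);
        if (n >= 0)
            return n;           /* got a real datagram */
        if (errno != ECONNREFUSED)
            return -1;          /* a genuine error */
        /* ECONNREFUSED: some earlier send hit a closed port; go around */
    }
    errno = ECONNREFUSED;       /* gave up after max_tries refusals */
    return -1;
}
```

Bounding the retries keeps a run of refusals from looping forever, which
the goto version does not guard against.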
Now, here's a simple test program to demonstrate the scenario I gave
waaaaaay back at the top of this message:
---------------->zing>------------------------ badrecv.c
/* Demonstrate the inexplicable "Connection Refused" on recvfrom() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>   /* strcpy, strlen, memset */

int main(void)
{
    struct sockaddr_in sin1,sin2,sinrecv;
    int sock,length,result;
    socklen_t sinlength;
    char buffer[1024];
    typedef struct sockaddr *sa;

    sock=socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP);
    if (sock<0) { perror("socket"); return 1; }
    printf("socket #%d\n",sock);

    memset(&sin1,0,sizeof(sin1));
    sin1.sin_family=AF_INET;
    sin1.sin_addr.s_addr=inet_addr("204.138.179.2");
    sin1.sin_port=htons(IPPORT_ECHO);
    sin2=sin1;
    sin2.sin_port=htons(9999); /* a "Connection Refused" port */

    strcpy(buffer,"Hello, world!\n");
    result=sendto(sock,buffer,strlen(buffer)+1,0,(sa)&sin1,sizeof(sin1));
    if (result<0) perror("sendto sin1");
    result=sendto(sock,buffer,strlen(buffer)+1,0,(sa)&sin2,sizeof(sin2));
    if (result<0) perror("sendto sin2");

    memset(buffer,0,sizeof(buffer));
    memset(&sinrecv,0,sizeof(sinrecv));
    sleep(2); /* give the replies time to come back */

    /* Loops forever; interrupt it once both results have been printed. */
    for (;;)
    {
        printf("reading...\n");
        sinlength=sizeof(sinrecv);
        length=recvfrom(sock,buffer,sizeof(buffer),0,
                        (sa)&sinrecv,&sinlength);
        /* Printed even on error, to show that "from" holds nothing valid. */
        printf("Received packet from %s:%d\n",
               inet_ntoa(sinrecv.sin_addr),
               ntohs(sinrecv.sin_port));
        if (length>0)
        {
            printf("packet: ");
            fflush(stdout);
            write(1,buffer,length);
        }
        else
        {
            perror("recvfrom");
        }
    }

    close(sock); /* not reached */
    return 0;
}