[270] in Zephyr_Bugs
case study: sticky zephyr subs
daemon@ATHENA.MIT.EDU (Marc Horowitz)
Fri Apr 19 06:46:17 1991
To: bugs@ATHENA.MIT.EDU
Cc: bug-zephyr@ATHENA.MIT.EDU
Reply-To: Marc Horowitz <marc@ATHENA.MIT.EDU>
Date: Fri, 19 Apr 91 06:45:55 EDT
From: Marc Horowitz <marc@ATHENA.MIT.EDU>
This is a long one. I didn't use sendbug because this bug report
affects machines in ways sendbug would rather not think about :-).
Machines involved are described in the bug report.
Questions for the zephyr team are at the end. Have fun tracking this
one down :-)
Marc
From the zephyr dump on neskaya, the server which e40-008-11 was using
at the time:
2476 (mickey@ATHENA.MIT.EDU):
'operations' 'message' ''
'message' 'personal' 'mickey@ATHENA.MIT.EDU'
'mail' 'pop' 'mickey@ATHENA.MIT.EDU'
'mail' 'popret' 'mickey@ATHENA.MIT.EDU'
'login' 'glquick@athena.mit.edu' ''
'login' 'mrappa@eagle.mit.edu' ''
'login' 'lmui@athena.mit.edu' ''
'login' 'kkyang@athena.mit.edu' ''
'login' 'wkchan@athena.mit.edu' ''
'login' 'detlev@athena.mit.edu' ''
'login' 'chanh@athena.mit.edu' ''
'login' 'vijayb@athena.mit.edu' ''
'login' 'zrkhan@athena.mit.edu' ''
'login' 'joshua@athena.mit.edu' ''
'login' 'chclee@athena.mit.edu' ''
'login' 'psheu@athena.mit.edu' ''
'login' 'fmeng@athena.mit.edu' ''
'filsrv' 'themis.mit.edu' ''
'filsrv' 'themis.mit.edu:/u4/lockers/mickey' ''
'filsrv' 'jason.mit.edu' ''
'filsrv' 'jason.mit.edu:/u1/bitbucket' ''
This is one of 62 sets of subscriptions from this host, even though
there were only 4 zephyr clients running. There was no location
information for mickey.
From the wtmp.1 file (the file was generated on a VAX, but dumped with
od -cx on an RT):
0013740 I T . E E 251 \b ( t t y p 1 \0 \0 \0
4954 2e45 45a9 0828 7474 7970 3100 0000
0013760 m i c k e y \0 \0 Z I T I . M I T
6d69 636b 6579 0000 5a49 5449 2e4d 4954
0014000 . E D U \0 \0 \0 \0 304 251 \b ( t t y p
2e45 4455 0000 0000 c4a9 0828 7474 7970
0014560 023 255 \b ( t t y p 1 \0 \0 \0 \0 \0 \0 \0
13ad 0828 7474 7970 3100 0000 0000 0000
0014600 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000 0000 0000 0000 0000 0000 0000 0000
0014620 \0 \0 \0 \0 ; 255 \b ( | \0 \0 \0 \0 \0 \0 \0
0000 0000 3bad 0828 7c00 0000 0000 0000
Translation: (by hand, because /usr/etc/ac dumps core)
mickey, logged in at Sun Apr 14 15:13:08 1991
logged out at Sun Apr 14 15:27:55 1991
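For anyone who wants to check the hand translation, here is a minimal
Python sketch. It assumes the time field of each wtmp record is a 4-byte
little-endian integer (VAX byte order) counting seconds since the Unix
epoch, and that EDT can be treated as a fixed UTC-4 offset. The function
name decode_ut_time is made up for this example; the byte strings are
copied in file order from the od dump above.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumption: EDT is a fixed UTC-4 offset for the dates in question.
EDT = timezone(timedelta(hours=-4), "EDT")

def decode_ut_time(raw: bytes) -> datetime:
    """Decode a 4-byte little-endian (VAX order) seconds-since-epoch field."""
    (secs,) = struct.unpack("<I", raw)
    return datetime.fromtimestamp(secs, tz=EDT)

# Bytes as they appear in the dump, in file order.
login = decode_ut_time(bytes.fromhex("c4a90828"))   # follows ZITI.MIT.EDU
logout = decode_ut_time(bytes.fromhex("3bad0828"))  # ttyp1 record with empty name

print(login.strftime("%Y-%m-%d %H:%M:%S"))   # 1991-04-14 15:13:08
print(logout.strftime("%Y-%m-%d %H:%M:%S"))  # 1991-04-14 15:27:55
```

Both values agree with the login and logout times given above.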
Now:
> date
Fri Apr 19 05:41:05 EDT 1991
That's almost five days that the subs are sticking around in the
server.
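The arithmetic, as a quick Python check. Both times are taken from this
report and treated as naive local (EDT) times; the variable names are
just for illustration.

```python
from datetime import datetime

# Naive local (EDT) times, copied from the report.
logout = datetime(1991, 4, 14, 15, 27, 55)
now = datetime(1991, 4, 19, 5, 41, 5)

stale_for = now - logout
print(stale_for)  # 4 days, 14:13:10
```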
Output from ``netstat -a | grep udp'':
udp 0 0 *.4076 *.*
udp 0 0 *.3672 *.*
udp 0 0 *.3142 *.*
udp 0 0 *.3069 *.*
udp 0 0 *.2507 *.*
udp 0 0 *.2476 *.*
udp 0 0 *.1183 *.*
udp 0 0 *.4545 *.*
udp 0 0 *.1018 *.*
udp 0 0 *.1019 *.*
udp 0 0 *.1020 *.*
udp 0 0 *.1021 *.*
udp 0 0 *.atest6 *.*
udp 0 0 *.zephyr-h *.*
udp 0 0 *.1023 *.*
udp 0 0 *.167 *.*
udp 0 0 *.snmp *.*
udp 0 0 *.daytime *.*
udp 0 0 *.time *.*
udp 0 0 *.ntalk *.*
udp 0 0 *.talk *.*
udp 0 0 *.athena-t *.*
udp 0 0 *.syslog *.*
udp 0 0 *.afs3-cal *.*
udp 0 0 E40-008-11.MIT.E.names *.*
udp 0 0 localhost.nameserv *.*
udp 0 0 *.nameserv *.*
The 6th line (port 2476) was keith's zwgc, which was run at login time.
Now, keith (solo26) logged in on e40-008-11, but his zwgc didn't get
any subs (SERVNAK). Unclear if it got location info. In fact, this
zwgc was not in the subscription database at all. But a zwrite to
mickey was delivered to keith's zwgc. keith got pop notifications,
messages, and other random stuff.
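A toy model of what this looks like from the outside, in Python. This is
not the zephyr server's code or data structures. It only assumes that
subscriptions are keyed by (host, port) and that nothing removes an
entry when the owning client goes away. The names subs, subscribe and
deliver are hypothetical.

```python
# Toy model only: assumes subscriptions are keyed by (host, port) and
# are never removed when the owning client goes away.
subs = {}  # (host, port) -> set of (class, instance, recipient)

def subscribe(host, port, triples):
    """Record a set of subscription triples for a client address."""
    subs.setdefault((host, port), set()).update(triples)

def deliver(cls, inst, recip):
    """Return the (host, port) pairs a matching notice would be sent to."""
    return sorted(addr for addr, triples in subs.items()
                  if (cls, inst, recip) in triples)

# mickey's zwgc subscribes from port 2476, then mickey logs out.
subscribe("e40-008-11", 2476,
          {("message", "personal", "mickey@ATHENA.MIT.EDU")})

# keith's zwgc later binds port 2476. Its own subscription request is
# refused (SERVNAK), so the table is unchanged.
targets = deliver("message", "personal", "mickey@ATHENA.MIT.EDU")
print(targets)  # [('e40-008-11', 2476)]
```

Under those assumptions, anything addressed to mickey lands on whoever
is currently listening on port 2476, which matches what keith saw.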
Apparently, subs are sticking, locations are not, and they're sticking
around for a LONG time. Perhaps the easy solution for now is to have
subs expire in twice the expiration time of the ticket used to acquire
the sub.
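A sketch of that expiry rule in Python, to make the proposal concrete.
The function names, the 10-hour ticket lifetime, and the idea that the
sub was acquired at login time are all assumptions for the example, not
anything taken from the server.

```python
from datetime import datetime, timedelta

def sub_expiry(acquired_at: datetime, ticket_lifetime: timedelta) -> datetime:
    """A sub expires at twice the lifetime of the ticket used to get it."""
    return acquired_at + 2 * ticket_lifetime

def is_expired(acquired_at: datetime, ticket_lifetime: timedelta,
               now: datetime) -> bool:
    """True once the current time has reached the sub's expiry time."""
    return now >= sub_expiry(acquired_at, ticket_lifetime)

# Hypothetical values: sub acquired at mickey's login, 10-hour ticket.
acquired = datetime(1991, 4, 14, 15, 13, 8)
lifetime = timedelta(hours=10)
now = datetime(1991, 4, 19, 5, 41, 5)

print(sub_expiry(acquired, lifetime))       # 1991-04-15 11:13:08
print(is_expired(acquired, lifetime, now))  # True
```

With numbers like these, mickey's subs would have been dropped the day
after he logged out instead of still being live on Friday.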
Things to check, when I don't have a million problem sets:
1) What happens when a sub comes in on a port which is already
allocated? Does it fail? With SERVNAK? Is the original subscription
cancelled? When this happens, does the server ping the client? Does
the server log this? Can a ping work, since there will now be a
client listening?
2) When does the server time out a client? Why isn't it happening? If
mickey was still being sent mail pings five days later, it's reasonable
to assume they had been sent the entire time that there was no client
listening. How are zmailpings sent? UNACKED? UNSAFE?
3) How did the user log out? What does zwgc do in this case? It
should be noted that the user had a temp account, as themis:u4 was
down at the time of the login.