[17498] in Athena Bugs
(sun4, 8.3) Netscape sometimes closes stdin on streaming helper app
daemon@ATHENA.MIT.EDU (Larry Stone)
Tue Feb 1 21:12:08 2000
Date: Tue, 1 Feb 2000 21:12:03 EST
From: Larry Stone <lcs@MIT.EDU>
Reply-To: <lcs@MIT.EDU>
To: bug-infoagents@MIT.EDU
Cc: lcs@MIT.EDU
Message-Id: <CMM.0.90.4.949457523.lcs@defiant.mit.edu>
This has been shown to happen on the Athena 8.3 Sun and SGI platforms,
using Netscape 4.61 from the infoagents locker. The X window manager was twm
but that shouldn't matter.
Netscape had been configured to use an experimental streaming MP3 player
for the Electronic Reserves project. The relevant entry in my personal
.mailcap was:
audio/mpeg; /afs/athena.mit.edu/astaff/project/ereserves/bin/mpg123 -R -b 0 \
--ctl_subprocess /afs/athena.mit.edu/astaff/project/ereserves/bin/mpg123ctl -;\
stream-buffer-size=2000
Normally, when pointed at an MP3 file from a web server that
tags the data with MIME type audio/mpeg -- e.g.
https://eres.mit.edu/eres/docs/eres/21M.292_ZIPORYN/Baris.mp3
(and this also works:)
http://www.winandware.com/files/Dogwood%20Moon%20-%20Catherine%20Sleeps.mp3
Netscape pops up a window offering the choice of downloading the file or
playing from the network stream. The bug never shows up when "download"
is chosen, and that sometimes even seems to "cure" it. If you choose
"play from network", it works the first time. Interrupting the playback
by clicking Netscape's "STOP" (traffic light) icon or doing something else
with that Navigator window will *sometimes* trigger this bug in the _next_
playback:
The helper app (mpg123 in this case) is started with the standard input
file descriptor, 0, not open on any file. Most apps don't check for
that condition and react as if given no data. This one has been
modified to check and write a diagnostic when standard input is closed.
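A minimal sketch of such a check, not the actual change made to mpg123: it assumes a POSIX system, and the names fd_is_open() and check_stdin() are hypothetical. fcntl(fd, F_GETFD) fails with EBADF only when the descriptor is not open, so it distinguishes "closed" from "open but at end of file".

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is an open descriptor, 0 if closed. */
int fd_is_open(int fd)
{
    /* F_GETFD only reads the descriptor flags; it fails with EBADF
     * exactly when fd does not refer to an open file. */
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: call first thing in main(), before opening anything,
 * because any later open() could silently land on descriptor 0.
 * Returns 0 if standard input is usable, -1 (after a diagnostic) if not. */
int check_stdin(void)
{
    if (!fd_is_open(STDIN_FILENO)) {
        fprintf(stderr, "helper: standard input (fd 0) is closed\n");
        return -1;
    }
    return 0;
}
```

A helper app would call check_stdin() at the top of main() and exit with a nonzero status if it returns -1.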
Note that this is intermittent! It doesn't happen reliably after each
interrupted streaming download. Once it happens it seems to repeat on
each subsequent streaming download, however.
I've traced system calls made by Netscape and its child processes
and discovered what seems to be the bug: When creating the child for the
helper app, it forks, and in the child it closes FD 0 and then closes
the unused pipe FDs before dup'ing the ones it uses to connect the
pipe from the parent to standard input. Sometimes it either gets these
actions out of order or chooses to close the wrong descriptor, and closes
one of the pipes before dup'ing it.
The relevant part of a "normal" trace looks something like:
vfork() [in child]
close(0) -> 0
close(23) -> 0
close(25) -> 0
fcntl(24, F_DUPFD, 0) -> 0
exec(...helper app)
The buggy trace looks like:
vfork() [in child]
close(0) -> 0
close(24) -> 0
close(24) -> EBADF
fcntl(24, F_DUPFD, 0) -> EBADF
exec(...helper app)
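For comparison with the two traces, here is a sketch of an ordering that cannot close the pipe before duplicating it. This is not Netscape's code, which was not available for this report; spawn_helper() is a hypothetical name, and it uses fork() rather than vfork() for simplicity. The point is only that the read end is installed on descriptor 0 with dup2() first, and the leftover pipe descriptors are closed afterwards.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch: start argv[0] with its standard input connected to
 * a new pipe. On success returns the child's pid and stores the write end
 * of the pipe in *write_fd; on failure returns -1. */
pid_t spawn_helper(char *const argv[], int *write_fd)
{
    int p[2];
    pid_t pid;

    if (pipe(p) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: duplicate first, close second. dup2() replaces whatever
         * was on fd 0, so no separate close(0) is needed. */
        if (p[0] != STDIN_FILENO) {
            if (dup2(p[0], STDIN_FILENO) == -1)
                _exit(127);
            close(p[0]);          /* original read end, now redundant */
        }
        close(p[1]);              /* child never writes to the pipe */
        execvp(argv[0], argv);
        _exit(127);               /* exec failed */
    }

    /* Parent: keep only the write end. */
    close(p[0]);
    *write_fd = p[1];
    return pid;
}
```

The parent then writes the stream to *write_fd and closes it when done, at which point the helper sees end of file on standard input.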
A further complication: making the helper app a /bin/sh script that
execs the real app hides the bug, because sh leaks a file descriptor!
By the time sh execs your app, there *is* an open file on fd 0!
Normally, without sh, the balance of opens and closes in the stub code
in Netscape and program startup is correct, so the helper app's main()
actually sees a closed standard input.
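One plausible mechanism for the sh behavior, offered as an assumption since the shell's source was not checked: POSIX requires open() to return the lowest-numbered descriptor that is not currently open. If fd 0 is closed when sh starts, the first file sh opens (for example, the script itself) lands on fd 0 and is inherited across the exec. The sketch below demonstrates only that allocation rule; reopen_after_closing_stdin() is a hypothetical name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demonstration: close standard input, then open path.
 * Returns whatever descriptor number open() hands back. */
int reopen_after_closing_stdin(const char *path)
{
    close(STDIN_FILENO);
    /* With fd 0 free, the lowest unused descriptor is 0, so the newly
     * opened file silently becomes "standard input". */
    return open(path, O_RDONLY);
}
```

Calling reopen_after_closing_stdin("/dev/null") returns 0, which is why a helper that wants to detect a closed standard input has to check before opening anything else.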
Maybe if we can get this explanation to someone within Netscape who
cares enough to look at the source they'll just say "D'OH" and see
what's wrong. It seems to be that sort of bug, although intermittent.
thanks,
-- Larry