[13838] in bugtraq
Re: CGI.pm and the untrusted-URL problem
daemon@ATHENA.MIT.EDU (Kragen Sitaker)
Tue Feb 15 15:11:09 2000
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Message-Id: <Pine.GSO.4.21.0002141532350.24442-100000@kirk.dnaco.net>
Date: Mon, 14 Feb 2000 15:48:33 -0500
Reply-To: Kragen Sitaker <kragen@POBOX.COM>
From: Kragen Sitaker <kragen@POBOX.COM>
X-To: marcs@znep.com, bugtraq@securityfocus.org, lstein@cshl.org,
ken.coar@golux.com
To: BUGTRAQ@SECURITYFOCUS.COM
support@dnaco.net removed from addressee list because they probably
don't want to hear the whole conversation; I just want them to fix our
local CGI.pm so my web pages are safe. :)
Marc Slemko writes:
> On Mon, 14 Feb 2000, Kragen Sitaker wrote:
> > Diagnosis
> > ---------
> >
> > It appears that this happens because the unencoded space is interpreted
> > by the HTTP server (Apache 1.3.6 in my tests) as separating the URL
> > from the protocol name. So the environment variable SERVER_PROTOCOL
> > gets set to everything following the space, followed by a space and the
> > actual protocol, such as "HTTP/1.0".
>
> Correct, this does appear to be a bug. I suspect that a lot of such bugs
> will be found. Unfortunately.
>
> However it is important to note that this does not exploit a bug in
> Apache. Apache is choosing to deal with an illegal request in a perfectly
> legitimate manner. At least, that is my understanding of what the spec
> says; I haven't checked it closely WRT this particular issue.
I think you're right.
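
To make the diagnosis concrete, here is a rough Python sketch of the lenient parsing described above. It is not Apache's code. The function name and the sample request are made up for illustration, and it assumes only the first two spaces are treated as separators.

```python
def split_request_line(line):
    """Split a request line leniently (illustrative, not Apache's code).

    Assumption: only the first two spaces separate fields, so anything
    after the second space lands in the protocol field.
    """
    method, url, protocol = line.split(" ", 2)
    return method, url, protocol


# Hypothetical request with an unencoded space inside the URL.
method, url, protocol = split_request_line(
    "GET /cgi-bin/test.cgi?x=1 <b>injected</b> HTTP/1.0"
)
# The protocol field now carries the injected text plus the real protocol.
print(repr(protocol))
```

A CGI script that trusts the protocol field would then see the attacker's text in it.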
> Part of Apache's functionality is to pass unknown methods and protocols on
> to CGIs. It is arguable that Apache should explicitly reject any
> request with more than two unencoded spaces in it.
Well, unknown methods I certainly agree with; but if the protocol is
completely unknown --- not even a version of HTTP --- how can Apache
reasonably think it knows what part of the request constitutes the URL,
or when it has reached the end of the request?
Apache, in this case, constitutes the interface between mutually
untrusted contexts: a Web browser and a CGI script. (And, as CERT
points out, there's a third context involved, trusted by neither of the
other two --- the URL provider.) As I see it, part of its purpose in
life is to restrict the information passed between these contexts to a
known and unsurprising set of channels.
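
As a sketch of what restricting the channel might look like (Python, purely illustrative; the pattern and the name parse_strict are my assumptions, not anything Apache implements), a strict front end could accept unknown methods but refuse any request line that is not exactly three space-separated fields ending in an HTTP version:

```python
import re

# Assumption: a request line is "METHOD SP URL SP HTTP/x.y" and nothing else.
# The method pattern is simplified to letters only; unknown methods still pass.
REQUEST_LINE = re.compile(r"([A-Za-z]+) (\S+) (HTTP/[0-9]+\.[0-9]+)")


def parse_strict(line):
    """Return (method, url, protocol), or None if the line is malformed."""
    match = REQUEST_LINE.fullmatch(line)
    if match is None:
        return None
    return match.groups()
```

With this check, a request whose URL contains an unencoded space is rejected instead of having its tail passed along as the protocol.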
> > RFC 1738 and RFC 2068 say that only a-z, 0-9, "+", ".",
> > and "-" are allowed in scheme names. Accordingly, I suggest the
> > following change to CGI.pm:
>
> Or it could simply properly encode things, as it should do for all data
> supplied by the user that is output.
>
> Filtering is often easier, however, as encoding can be very context
> sensitive.
I'm not sure what the proper encoding for scheme names would be. :)
self_url does appear to properly encode malicious data inserted in
other parts of the input URL.
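
The filtering approach can be sketched like this in Python. This is not the CGI.pm change itself; the function name and the fallback to "http" are assumptions made for the example. Only the character set comes from the RFCs cited above.

```python
import re

# Characters allowed in scheme names, per the RFCs cited above:
# a-z, 0-9, "+", "." and "-".
SCHEME_CHARS = re.compile(r"[a-z0-9+.-]+")


def scheme_from_protocol(server_protocol, default="http"):
    """Derive a URL scheme from SERVER_PROTOCOL, filtering bad input.

    Assumption: anything that is not a legal scheme name is replaced
    by `default` rather than being encoded.
    """
    candidate = server_protocol.split("/", 1)[0].lower()
    if SCHEME_CHARS.fullmatch(candidate):
        return candidate
    return default
```

So "HTTP/1.0" yields "http", while a value containing "<" or a space falls back to the default instead of being echoed into the page.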
> > The successful exploit requires a remarkable chain of extreme forgiveness:
> > 1- The web browser must accept an illegal URL from (possibly valid,
> > although very unusual) HTML.
> > 2- The web browser must send an illegal HTTP request with the illegal
> > URL, without %-encoding the URL to make it legal.
>
> Note that IE appears to be far better in making sure it only makes legal
> requests. Good job Microsoft, in this particular situation.
What version of IE is better in this way? MSIE 3.0 is just as lenient
as Netscape 4.6 in this situation. I don't have any machines with MSIE
4 installed, because MSIE 4 makes me uncomfortable.
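
For comparison, here is roughly what %-encoding the URL before sending it would look like, using Python's standard urllib.parse.quote. The sample URL and the choice of characters left unencoded are assumptions for the example; I am not claiming any browser does exactly this.

```python
from urllib.parse import quote

# Hypothetical illegal URL containing a space and angle brackets.
raw_url = "/cgi-bin/test.cgi?x=1 <b>injected</b>"

# Assumption: keep "/", "?", "=" and "&" as-is and %-encode the rest.
encoded = quote(raw_url, safe="/?=&")
print(encoded)
```

Once the space is sent as %20, the request line has only two unencoded spaces and the server has nothing extra to mistake for a protocol name.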
--
<kragen@pobox.com> Kragen Sitaker <http://www.pobox.com/~kragen/>
The Internet stock bubble didn't burst on 1999-11-08. Hurrah!
<URL:http://www.pobox.com/~kragen/bubble.html>
The power didn't go out on 2000-01-01 either. :)