[761] in WWW Security List Archive


glimpseHTTP and httpd access control

daemon@ATHENA.MIT.EDU (Prentiss Riddle)
Fri Jul 7 14:57:53 1995

From: riddle@is.rice.edu (Prentiss Riddle)
To: www-security@ns2.rutgers.edu
Date: Fri, 7 Jul 1995 10:06:37 -0500 (CDT)
Errors-To: owner-www-security@ns2.rutgers.edu

[I haven't received any response from the Glimpse mailing list, so I
thought I'd try this here as well.  Consider this a specific case of a
general problem:  how to provide a search engine for a web hierarchy
without violating httpd's access control.  For more information on Glimpse
and glimpseHTTP see: http://glimpse.cs.arizona.edu:1994/glimpsehttp.html ]


I am interested in using Glimpse and glimpseHTTP to provide a general
search index of much of the information on my web server.  However, one
issue disturbs me: it seems to me that glimpseHTTP (like many other
gateways) could be used to get around access control and cause
restricted data to "leak" off of our server.

The details: NCSA httpd provides a familiar mechanism for access
control based on the client's IP address.  Users may create ".htaccess"
files in any directories in the web data hierarchy in order to limit
access to clients coming from a particular domain (e.g., to restrict
certain portions of the data tree to the campus LAN).  However, the
"cgi-bin/mfs" gateway and the "wwwlib/getfile" utility used in
glimpseHTTP do not appear to enforce the restrictions imposed in
".htaccess" files.  Thus an off-campus user who makes a Glimpse search
via glimpseHTTP could retrieve copies of files which were supposed to
be distributed only within the campus.

I can think of two approaches to solving this problem:

   (1) Modify "cgi-bin/aglimpse" so that it eliminates the use of
   "cgi-bin/mfs" and instead refers the user to the URL of the original
   document.  Without "cgi-bin/mfs" as an intermediary, httpd's own
   native access control would apply.  (A drawback to this approach is
   that it would make it impossible to create URLs referring to
   specific lines within the document.)

   (2) Modify "wwwlib/getfile" so that it parses and obeys access
   control restrictions within ".htaccess" files, based on the IP
   address of the client.  This still might run afoul of other access
   control restrictions (e.g. per-user access) in ".htaccess" files.
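
   To make approach (2) concrete, here is a minimal sketch in Python of
   the kind of check "wwwlib/getfile" would need.  It is not
   glimpseHTTP code.  The function names, the sample ".htaccess" text
   and the addresses are all hypothetical, and the handling of "order"
   is a simplifying assumption: with "deny,allow" a client is let in
   unless it is denied and not allowed, with "allow,deny" it must be
   allowed and not denied, and anything else is refused.  Per-user
   directives are ignored, which is exactly the gap noted above.

```python
# Hypothetical sample of a host-based restriction; not taken from a real server.
SAMPLE_HTACCESS = """\
<Limit GET>
order deny,allow
deny from all
allow from .example.edu 192.0.2
</Limit>
"""


def parse_limit(htaccess_text):
    """Collect the order/allow/deny directives found inside <Limit> blocks."""
    rules = {"order": "deny,allow", "allow": [], "deny": []}
    in_limit = False
    for raw in htaccess_text.splitlines():
        line = raw.strip().lower()
        if not line or line.startswith("#"):
            continue
        if line.startswith("<limit"):
            in_limit = True
        elif line.startswith("</limit"):
            in_limit = False
        elif in_limit:
            words = line.split()
            if words[0] == "order" and len(words) >= 2:
                # Tolerate "deny, allow" as well as "deny,allow".
                rules["order"] = "".join(words[1:])
            elif words[0] in ("allow", "deny") and len(words) >= 3 and words[1] == "from":
                rules[words[0]].extend(words[2:])
    return rules


def matches(pattern, client_ip, client_host):
    """True if a host pattern (all, partial IP, or domain name) covers the client."""
    if pattern == "all":
        return True
    if pattern[0].isdigit():
        # Partial IP address: compare on whole octets only.
        prefix = pattern.rstrip(".")
        return client_ip == prefix or client_ip.startswith(prefix + ".")
    # Domain name: compare on a label boundary so "badexample.edu" does not match.
    name = pattern.lstrip(".")
    host = client_host.lower()
    return host == name or host.endswith("." + name)


def is_allowed(rules, client_ip, client_host=""):
    """Apply the (assumed) order semantics described in the lead-in."""
    allowed = any(matches(p, client_ip, client_host) for p in rules["allow"])
    denied = any(matches(p, client_ip, client_host) for p in rules["deny"])
    if rules["order"] == "deny,allow":
        return allowed or not denied
    if rules["order"] == "allow,deny":
        return allowed and not denied
    return False  # unknown ordering: fail closed
```

   In a CGI program the client address would come from the REMOTE_ADDR
   and REMOTE_HOST environment variables.  A real version would also
   have to check every ".htaccess" file from the top of the data tree
   down to the directory holding the requested file, not just one.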

However, both of these approaches still share a drawback: even the
Glimpse search report, which provides only filenames and the excerpted
lines which match a search, could be used to probe restricted areas
within a WWW data tree.  Even the leak of a fragment of a restricted
document might be considered a serious security problem.
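
One way to close that hole would be to filter the search report itself
before it is shown, dropping every hit from a directory the client
could not read directly.  The sketch below is only an illustration in
Python under stated assumptions: the hit format (path, line number,
excerpt), the function names and the example paths are hypothetical,
and the "may_read" test stands in for whatever access check the
gateway actually has.

```python
import posixpath


def filter_hits(hits, may_read):
    """Keep only the hits whose directory passes the may_read(directory) test."""
    decisions = {}  # cache one decision per directory
    kept = []
    for path, lineno, excerpt in hits:
        directory = posixpath.dirname(path)
        if directory not in decisions:
            decisions[directory] = may_read(directory)
        if decisions[directory]:
            kept.append((path, lineno, excerpt))
    return kept


# Hypothetical stand-in for a real access check: one campus-only subtree.
RESTRICTED = ("/www/data/campus-only",)


def off_campus_may_read(directory):
    """An off-campus client may read anything outside the restricted subtrees."""
    return not any(
        directory == root or directory.startswith(root + "/")
        for root in RESTRICTED
    )
```

Filtering at this stage hides both the filenames and the excerpted
lines, at the cost of one access decision per directory per search.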

Has anyone come up with a solution to this problem?  Or do glimpseHTTP
users content themselves with serving out indexes of only unrestricted
web documents?

-- Prentiss Riddle ("aprendiz de todo, maestro de nada") riddle@rice.edu
-- RiceInfo Administrator, Rice University / http://is.rice.edu/~riddle
-- Home office: 2002-A Guadalupe St. #285, Austin, TX 78705 / 512-323-0708
