[3878] in WWW Security List Archive

Re: maintaining state and security

daemon@ATHENA.MIT.EDU (jwp@chem.ucsd.edu)
Sat Dec 21 20:40:37 1996

From: jwp@chem.ucsd.edu
Date: Sat, 21 Dec 1996 16:10:51 -0800
To: www-security@ns2.rutgers.edu
Errors-To: owner-www-security@ns2.rutgers.edu

 > I have a rather large site on which I maintain state using CGI scripts ...
 > [with session IDs. How do I cope with search engines? (paraphrased from the
 > original)]

Many (but not all) search engines identify themselves via the User-Agent
header, which CGI scripts see as HTTP_USER_AGENT. Somewhere on the web is
a list of the known ones. I don't have the URL; I just search for it when
I need it. Look around for references to robots.txt, the file that is used
to keep (polite) search engines out of a site; I think that's where I last
saw the list.

Given that list, your CGI scripts could look at HTTP_USER_AGENT to allow
(polite) search engines to do their indexing without a session ID.
Since your scripts clearly must deal with the "no session ID" case
already, this shouldn't be too hard.
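
A minimal sketch of that check, assuming a Python CGI script. The crawler
tokens below are hypothetical placeholders, not real engine names; a real
list would come from whatever published list of known crawlers you find.
The function names are also made up for illustration.

```python
import os

# Hypothetical substrings for illustration only; replace with entries
# taken from a published list of known search engine user agents.
KNOWN_CRAWLER_TOKENS = ("examplebot", "samplespider")


def is_known_crawler(user_agent, tokens=KNOWN_CRAWLER_TOKENS):
    """Return True if the User-Agent string contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in tokens)


def needs_session_id(environ=os.environ):
    """Decide whether this request should be given a session ID.

    CGI passes the User-Agent request header to the script in the
    HTTP_USER_AGENT environment variable.
    """
    return not is_known_crawler(environ.get("HTTP_USER_AGENT", ""))
```

A script would call needs_session_id() before issuing an ID and fall
through to its existing "no session ID" path when it returns False.
Anything that does not identify itself is treated as an ordinary visitor.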

Since most of the popular index sites have fairly well-behaved search
engines, this should solve most of your problems. Ill-behaved engines
will be harder to handle. Fairly short timeouts on session IDs will take
care of some of them; the rest you will probably just have to live with.
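
One way the timeout idea might look, again as a Python sketch. The
15-minute value and the in-memory dict of last-seen times are assumptions
for illustration; the original site's session storage is not described.

```python
import time

# Assumed timeout for illustration; tune it to how long real visitors idle.
SESSION_TIMEOUT_SECONDS = 15 * 60


def is_session_expired(last_seen, now=None, timeout=SESSION_TIMEOUT_SECONDS):
    """Return True if more than `timeout` seconds have passed since last_seen."""
    if now is None:
        now = time.time()
    return (now - last_seen) > timeout


def purge_expired(sessions, now=None, timeout=SESSION_TIMEOUT_SECONDS):
    """Return a new dict holding only the sessions that are still live.

    `sessions` maps a session ID to the time (seconds since the epoch)
    at which that session was last used.
    """
    if now is None:
        now = time.time()
    return {
        sid: seen
        for sid, seen in sessions.items()
        if not is_session_expired(seen, now, timeout)
    }
```

With a short timeout, a session ID that an engine has indexed stops being
valid soon after the crawl, so a later visitor arriving through the stale
link simply gets a fresh session.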

-- John W Pierce, Chem & Biochem, UC San Diego
   jwp@ucsd.edu