[7450] in www-talk@info.cern.ch
Re: Client-side searching proposal
daemon@ATHENA.MIT.EDU (Gary Adams - Sun Microsystems Labs)
Tue Jan 31 10:21:28 1995
Date: Tue, 31 Jan 1995 15:40:00 +0100
Errors-To: listmaster@www0.cern.ch
Reply-To: Gary.Adams@east.sun.com
From: Gary.Adams@east.sun.com (Gary Adams - Sun Microsystems Labs BOS)
To: Multiple recipients of list <www-talk@www0.cern.ch>
> From gtn@ebt.com Tue Jan 31 09:03:59 1995
> Date: Tue, 31 Jan 1995 09:03:51 -0500
> From: Gavin Nicol <gtn@ebt.com>
> To: Gary.Adams@East
> Cc: www-talk@www0.cern.ch
> Subject: Re: Client-side searching proposal
> Content-Length: 704
> X-Lines: 14
>
> >There are standard SQL extensions that provide for a limited amount of
> >full/text query specification. The formulation of the query is only half of
> >the problem. The more difficult part of the problem (in my opinion) is how do
> >you handle the "sub-document" addressability for the relevant fragments of
> >the document to be retrieved or to be highlighted. Traditional database
>
> Sub-document addressing is not a hard problem for SGML documents. Have
> a look at the TEI schemes, or perhaps the HyTime schemes.
After a document is frozen on a CD-ROM, can I go back and impose new
addressing schemes beyond the originally named elements of the document,
e.g. the third sentence of the Constitution?
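
To make the question concrete, here is a minimal sketch of an address
imposed after the fact: the frozen text is never modified, and "the third
sentence" is resolved to character offsets at lookup time. The sentence
pattern is a naive assumption (it will misfire on abbreviations), and the
names sentence_spans and nth_sentence are hypothetical.

```python
import re

# Naive sentence pattern (an assumption, not a real segmenter): a sentence
# runs from a non-space character to the next '.', '!' or '?' that is
# followed by whitespace or the end of the text.
_SENTENCE = re.compile(r"\S.*?(?:[.!?](?=\s|$)|$)", re.S)


def sentence_spans(text):
    """Return (start, end) character offsets for each sentence in text."""
    return [(m.start(), m.end()) for m in _SENTENCE.finditer(text)]


def nth_sentence(text, n):
    """Return the offsets of the n-th sentence (1-based), or None."""
    spans = sentence_spans(text)
    return spans[n - 1] if 1 <= n <= len(spans) else None


# The frozen document is only read, never rewritten.
frozen = "We the People. Article one. Article two follows."
start, end = nth_sentence(frozen, 3)
print(start, end, frozen[start:end])
```

The offsets act as an external address layered over the original, so no
named element has to exist in the document itself.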
>
> Can someone provide a URL for the online TEI specs?
>
The home page for TEI is <A HREF="http://etext.virginia.edu/TEI.html">
http://etext.virginia.edu/TEI.html</A>
> Also, have a look at <URL:http://www.ebt.com/> for a server that has
> sub-document addressing capabilities based on the SGML tree structure.
>
>
The full SGML system at EBT addresses the need for authored structural
addressability. The type of sub-document addressability that I am looking
for would allow a search engine to refer to a region running from the last
paragraph of chapter 2 through the first paragraph of chapter 4
(potentially spanning 3 HTML files) and present it to a user as the
information that satisfies a complex "how to" query.
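
As a rough illustration of such a region, the sketch below describes it
by two end points, each a (file, paragraph) pair, and resolves it against
an ordered set of files. Everything here is an assumption made for the
example: the file names, the Point and Region types, the resolve function,
and the idea that each file has already been split into a non-empty list
of paragraphs.

```python
from typing import NamedTuple


class Point(NamedTuple):
    file: str        # hypothetical file name
    paragraph: int   # 0-based; negative values count from the end


class Region(NamedTuple):
    start: Point
    end: Point


def resolve(region, files):
    """Expand a region into (file, paragraph index, text) triples.

    files maps file name -> non-empty list of paragraphs, in reading order.
    """
    names = list(files)
    first = names.index(region.start.file)
    last = names.index(region.end.file)
    result = []
    for k in range(first, last + 1):
        paras = files[names[k]]
        # Only the first and last files are clipped; files in between
        # contribute all of their paragraphs.
        lo = region.start.paragraph % len(paras) if k == first else 0
        hi = region.end.paragraph % len(paras) if k == last else len(paras) - 1
        result.extend((names[k], n, paras[n]) for n in range(lo, hi + 1))
    return result


files = {
    "ch2.html": ["intro", "detail", "last of ch2"],
    "ch3.html": ["all of", "chapter 3"],
    "ch4.html": ["first of ch4", "rest"],
}
region = Region(Point("ch2.html", -1), Point("ch4.html", 0))
for item in resolve(region, files):
    print(item)
```

The point of the sketch is that the region is computed by the search
engine, not authored: nothing in the three files marks it as a unit.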
I'd also like to construct complex standing queries about "the president
of the United States" in the news, which return a conditional result. The
selection mechanism for a search engine can be distinct from both the
scoring and highlighting mechanisms. A highlighter might incorrectly
highlight the word "Clinton" in an older document if he was "Governor
Clinton" at the time it was written.
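
A small sketch of that separation, assuming each document carries a date:
selection accepts any document that mentions someone who has held the
office, while highlighting is conditional on who held it on the document's
date. The TERMS table, the document layout, and the function names are all
hypothetical, and scoring is left out to keep the example short.

```python
from datetime import date

# Illustrative table only: (name, start of term), in chronological order.
TERMS = [
    ("Bush", date(1989, 1, 20)),
    ("Clinton", date(1993, 1, 20)),
]


def president_on(day):
    """Return the name from TERMS whose term had most recently begun."""
    holder = None
    for name, start in TERMS:
        if start <= day:
            holder = name
    return holder


def select(doc):
    """Selection: does the document mention anyone listed in TERMS?"""
    return any(name in doc["text"] for name, _ in TERMS)


def highlight_terms(doc):
    """Highlighting: only the name that fits the role on the document's date."""
    holder = president_on(doc["date"])
    return [holder] if holder and holder in doc["text"] else []


old = {"date": date(1991, 6, 1), "text": "Governor Clinton met President Bush."}
new = {"date": date(1994, 6, 1), "text": "Clinton signed the bill."}
for doc in (old, new):
    print(select(doc), highlight_terms(doc))
```

Both documents are selected, but in the older one "Clinton" is left
unhighlighted because the name did not refer to the president on that
date.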