[163] in Tooltime
Re: What got done today.
daemon@ATHENA.MIT.EDU (bdrosen@MIT.EDU)
Mon Jul 15 13:03:00 1996
From: bdrosen@MIT.EDU
To: wade@MIT.EDU
Cc: tooltime@MIT.EDU
Reply-To: bdrosen@MIT.EDU
Date: Mon, 15 Jul 1996 13:02:51 EDT
wade@mit.edu writes
>>Trying to think of options, I wonder if we created a new field, e.g.
>>mit_history_search, which had the last 2,000 characters (max. number of
>>characters that can fit in a datatype of Char) of the mit_history field.
>>The mit_history_search field could be indexed and thus searchable.
>>
>>I don't know if this option is possible or if this is worth the
>>effort? Brett/Lynne - can we write a Tcl routine to strip the last 2k
>>characters on an Update/Insert and store this in a new field -
>>mit_history_search? Are there other options?
Do we have any idea how long the average history field is?
If so, depending on the size, we could do the following. Say the
average history size is 4,000 characters: we would have two fields,
mit_history_search1 and mit_history_search2, each containing half
of the history field. Then we could have a Tcl routine that breaks
the history field into sections and copies the sections into those
fields. However, this method would only work some of the time
(history fields over 4,000 characters would be truncated in some
manner).
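To make the two-field split above concrete, here is a minimal sketch in Python rather than Tcl. The helper name split_history is hypothetical, and so is the choice of what to drop: when the history is too long, this version keeps the most recent text (the tail), in line with the "last 2,000 characters" idea in the quoted message. The 2,000-character limit is the one Wade gives for a Char field.

```python
# Hypothetical sketch: the field names come from the message above,
# the helper itself and its truncation policy are assumptions.
CHAR_LIMIT = 2000  # max characters in a Char field, per the quoted message


def split_history(history, n_fields=2, limit=CHAR_LIMIT):
    """Split history into n_fields chunks of at most `limit` characters.

    Returns (fields, truncated). `fields` maps mit_history_search1..N
    to text; `truncated` is True when the history did not fit and the
    oldest text was dropped.
    """
    capacity = n_fields * limit
    truncated = len(history) > capacity
    if truncated:
        # Assumption: keep the tail, i.e. the most recent entries.
        history = history[-capacity:]
    fields = {}
    for i in range(n_fields):
        fields[f"mit_history_search{i + 1}"] = history[i * limit:(i + 1) * limit]
    return fields, truncated
```

One thing to keep in mind with any split like this: a search string that straddles the boundary between the two fields will not match either field on its own.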
Using the FileMaker data I came up with the following:
> wc -C fm.export.history.7.1
1285640 fm.export.history.7.1
(number of characters)
> wc -l fm.export.history.7.1
899 fm.export.history.7.1
(number of lines, which is also the number of entries)
> bc
1285640/899
1430
(average number of characters in the history field for the FileMaker
data. This includes the logid, which is not really part of the history
field.)
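The average alone does not say how many entries would actually be truncated, so a per-entry count may be more useful than the wc/bc figure. The sketch below is one way to get it, assuming (as the line count above suggests) one entry per line of the export. The function name length_stats and the 4,000-character capacity default are illustrative, and the lengths still include the logid, as in the wc numbers.

```python
def length_stats(lines, capacity=4000):
    """Summarize entry lengths for an export with one entry per line.

    Returns the number of entries, the integer average length (like
    bc's integer division), and how many entries exceed `capacity`.
    Unlike wc, the trailing newline is not counted.
    """
    lengths = [len(line.rstrip("\n")) for line in lines]
    if not lengths:
        return {"entries": 0, "average": 0, "over_capacity": 0}
    return {
        "entries": len(lengths),
        "average": sum(lengths) // len(lengths),
        "over_capacity": sum(1 for n in lengths if n > capacity),
    }


# Possible use against the export file named above:
#     with open("fm.export.history.7.1") as f:
#         print(length_stats(f))
```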
Another possible solution to this problem:
Dump the history fields and logid to a flat file. Run an indexer
on the file to create a structure that can be searched quickly.
Have the interface to the search be a web form that gives you
the logid and matching text as the answer (or a message telling you
that nothing was found). I think I know of a fairly easy way
to implement this. The dump and reindex would have to be run on some
sort of regular basis.
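As a rough illustration of the dump-and-index idea, here is a small inverted index in Python. It is only a sketch under stated assumptions: the message does not name an indexer or a file layout, so build_index and search are hypothetical helpers, and the input is assumed to be (logid, history) pairs already read from the flat file. The web form is left out; search returns the logid and matching text that the form would display, or an empty list when nothing is found.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")  # assumption: index lowercase alphanumeric words


def build_index(records):
    """Build a word -> set of logids index from (logid, history) pairs."""
    index = defaultdict(set)
    texts = {}
    for logid, history in records:
        texts[logid] = history
        for word in WORD.findall(history.lower()):
            index[word].add(logid)
    return index, texts


def search(index, texts, query):
    """Return (logid, history) for entries containing every query word."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return [(logid, texts[logid]) for logid in sorted(hits)]
```

Because the index is built from the full history text, nothing is truncated, but results are only as fresh as the last dump.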
We could even combine the two methods to get maximum availability of data.
(The first method may produce truncated results, but should provide
up-to-date info. The second method will provide untruncated results,
but may be as much as a day behind.)
We could also go with the keyword searcher in scopus, which would
probably be less useful than the other two suggestions unless
we have a sufficiently complete keyword list.
Brett