[527] in Athena User Interface


Re: Medusa doesn't play well with AFS

daemon@ATHENA.MIT.EDU (Christopher D. Beland)
Mon Dec 18 11:07:36 2000

Message-Id: <200012181607.LAA25516@Press-Your-Luck.mit.edu>
To: aui@MIT.EDU
Date: Mon, 18 Dec 2000 11:07:30 -0500
From: "Christopher D. Beland" <beland@MIT.EDU>


Ok, so I did a little more research into how Medusa works.  

A better way to prevent it from indexing all of /afs than deleting the
cron job is to simply add the line "file:///afs" to the file:
/usr/share/medusa/file-index-stoplist
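
For anyone who would rather script that change than edit the file by
hand, here is a minimal Python sketch.  The path and the "file:///afs"
entry are the ones given above; the helper name add_stoplist_entry and
the assumption that the stoplist holds one entry per line are
illustrative, not taken from Medusa's documentation.

```python
from pathlib import Path


def add_stoplist_entry(stoplist: Path, entry: str) -> bool:
    """Append `entry` on its own line unless it is already listed.

    Returns True if the file was modified, False if the entry was present.
    Assumes (unverified) that the stoplist is plain text, one entry per line.
    """
    text = stoplist.read_text() if stoplist.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Make sure the new entry starts on a fresh line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with stoplist.open("a") as handle:
        handle.write(f"{prefix}{entry}\n")
    return True


# Example call (needs write access to the file):
# add_stoplist_entry(Path("/usr/share/medusa/file-index-stoplist"), "file:///afs")
```

Writing under /usr/share normally requires root, and running the
helper twice is harmless because the second call changes nothing.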

It looks like the Nautilus "Find" feature only works on files that
have been indexed by Medusa - I don't know if this is going to change.
Medusa, I must say, is pretty nifty.  It took 40 minutes to index the
3.6 gigs worth of stuff on my home machine's local disk, but made
full-text searches of my hard drive ridiculously fast.  8) I think
I'll be using it next time I want to grep my e-mail archives for
something.  The search interface is also wonderfully sophisticated.
(I wonder how novice users will deal with it.)

I'm curious how it deals with files that change after the last full
index.  Does something check which files have changed since then, or
do I just lose?
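
One generic way an indexer could handle this (an illustration only,
not a description of what Medusa actually does) is to record when the
last index run started and then re-read only the files whose
modification time is newer.  The function name changed_since is
hypothetical.

```python
import os


def changed_since(root, last_index_time):
    """Return sorted paths under `root` modified after `last_index_time`.

    `last_index_time` is seconds since the epoch.  Deleted files are not
    detected by this approach; a real indexer would also have to notice
    paths that are in the index but no longer on disk.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > last_index_time:
                    changed.append(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return sorted(changed)
```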

Anecdotally, I find that fast file search is something that would be
quite useful to many users.  Right now, it takes a fair amount of
command-line sophistication to find a file in your locker based on
content or even filename.  Most users don't even know it's possible.

As Rebecca mentioned, the question of using Medusa on Athena-AFS
raises some complicated design issues.  

Part of the problem I can see straight off is that Medusa is currently
architected on a one-machine, multiple-user model.  The indexing
daemon runs as root and secures the index files so that only root can
read them.  When a user makes a query, the system only reports results
from files that the user has permission to read or list, as
appropriate.  This requires it to know how UNIX file permissions work.
Currently, I don't think there is support for understanding AFS
permissions.
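
To make the gap concrete, here is a rough Python sketch of the kind of
mode-bit check a result filter can do on a local UNIX filesystem.  It
is not Medusa's code; the names may_read and filter_hits are
hypothetical, and the check is deliberately simplified (it ignores
directory search permission and any ACLs).  AFS grants access through
per-directory ACLs, which these mode bits do not reflect, so a filter
like this would give wrong answers there.

```python
import os
import stat


def may_read(path, uid, gids):
    """Simplified UNIX read check for user `uid` with group ids `gids`."""
    info = os.stat(path)
    if uid == 0:
        return True  # root can read everything on a local filesystem
    if info.st_uid == uid:
        return bool(info.st_mode & stat.S_IRUSR)
    if info.st_gid in gids:
        return bool(info.st_mode & stat.S_IRGRP)
    return bool(info.st_mode & stat.S_IROTH)


def filter_hits(hits, uid, gids):
    """Keep only the index hits that the querying user may read."""
    visible = []
    for path in hits:
        try:
            if may_read(path, uid, gids):
                visible.append(path)
        except OSError:
            # The file disappeared since it was indexed; drop the hit.
            continue
    return visible
```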

There are a couple of architectural options to solve this problem for
MIT, all of which have drawbacks.

 - Run an indexer for the entire athena.mit.edu cell, and allow
clients to connect to it via the network.  (We do not trust root on
local workstations.)  This would provide functionality not unlike a
web search, except somewhat more intrusive, since users often put
files in their Public and www directories without telling anyone or
any indexing services about them.  It would also probably provide for
very inefficient searching.  Who knows how long the indexing process
would take.  This would also represent a radical change in operation
from Medusa's current architecture, and would require teaching it how
AFS permissions work.

 - Have an automatically-created index directory at the base of each
locker; configure Medusa to look for this directory when searching
files in that locker.  Problem: This will only work in lockers that
don't have more restrictive permissions set lower down, unless a fancy
kludge is constructed to make multiple indexes, depending on access
rights.  (Ick!)

 - Allow users to run the indexer at some point when they are logged
in.  (Either automatically in the background after a certain interval
since the last index, or perhaps on demand.)  This solves the
permissions problem, though it does introduce a greater load on the
servers if everyone is always indexing their files.  If done on
demand, users will lose the benefit of a fast search on a whim.
There's also the question of whether or not to index files in group
lockers the user accesses frequently.
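
For the last option, the "certain interval since the last index" test
could be as simple as a timestamp file kept next to the user's index.
The Python sketch below is only an illustration; the stamp-file
convention and the names needs_reindex and mark_indexed are
assumptions, not anything Medusa provides.

```python
import os
import time


def needs_reindex(stamp_path, max_age_seconds, now=None):
    """Return True if the last index run is missing or older than the limit."""
    now = time.time() if now is None else now
    try:
        last_run = os.path.getmtime(stamp_path)
    except OSError:
        return True  # no stamp file yet, so the user was never indexed
    return now - last_run >= max_age_seconds


def mark_indexed(stamp_path):
    """Create the stamp file if needed and set its mtime to the current time."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None)
```

A login hook could call needs_reindex with, say, a one-week limit and
start the indexer in the background only when it returns True, which
would keep the extra server load bounded.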


If someone were to implement a way to instruct Medusa *to* index
specified directories (as opposed to the *do not* index file currently
available), we could experiment with the indexer for the test
account, running it as the test user while it is logged in.
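
The decision such an "index these directories" list would drive is
small.  Here is a hypothetical Python sketch of it; Medusa has no such
option as far as this message knows, and the function names and the
/lockers/testuser path are made up for illustration.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """True if `path` is `root` itself or lies somewhere below it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def should_index(path, include_roots, exclude_roots=()):
    """Index a path only if it is under an include root and no exclude root."""
    if any(is_under(path, root) for root in exclude_roots):
        return False
    return any(is_under(path, root) for root in include_roots)
```

Comparing whole path components rather than string prefixes matters
here: with /lockers/testuser on the include list, a file under
/lockers/testuser2 is correctly left alone.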

Full-text searches in AFS are painfully slow; it would be cool to have
a speedier method working.  On the other hand, even a slow search is a
lot better than none.  Unfortunately, due to Athena's unique
file authentication issues, it looks like we'll be without a speedy
indexed search method for a while yet.  It would still be useful to
enable a slower, non-indexed search (simple hooks to find and grep
would do) until the better solution can be deployed.  Then again,
since it's still unclear whether Nautilus will even be included in the
Summer 2001 release, such intermediate solutions might be unnecessary.
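
As a rough idea of what the slow, non-indexed fallback amounts to,
here is a Python sketch that walks a directory tree and reports
matching lines, much like find piped into grep.  The name grep_tree is
hypothetical, and this is not wired into Nautilus in any way.

```python
import os


def grep_tree(root, needle, ignore_case=True):
    """Yield (path, line_number, line) for every line containing `needle`."""
    target = needle.lower() if ignore_case else needle
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary files from raising decode errors.
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        haystack = line.lower() if ignore_case else line
                        if target in haystack:
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file (for example, no permission); skip it.
                continue
```

Every query rereads every file, so the cost grows with the size of the
locker, which is exactly the slowness an index is meant to avoid.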


Anyway, I should implement the solution I mentioned on the usability
machines in NW42 before I forget about it.

-B.

===============================================================
Christopher Beland - http://web.mit.edu/beland/www/contact.html
MIT STS/Course 6 (EECS)   -   MIT Athena User Interface Project              
===============================================================

