[153] in Public-Access_Computer_Systems_Forum

Cataloging Internet Resources

daemon@ATHENA.MIT.EDU (Public-Access Computer Systems For)
Mon May 4 14:35:23 1992

Date:         Mon, 4 May 1992 13:31:33 CDT
Reply-To: Public-Access Computer Systems Forum <PACS-L%UHUPVM1.BITNET@RICEVM1.RICE.EDU>
From: Public-Access Computer Systems Forum <LIBPACS%UHUPVM1.BITNET@RICEVM1.RICE.EDU>
To: Multiple recipients of list PACS-L <PACS-L@UHUPVM1.BITNET>

3 Messages, 191 Lines
*-----

From: HICKEY@ITHACA.BITNET
Subject: Re: A suggestion

Basic & related questions about "cataloging the Internet", worth
settling early on: Who "owns" the info on an Internet-accessible
resource? &, Who maintains this volatile data?  Long-term, we
should combine these responsibilities & place them in the hands
of whoever maintains the catalog or resource: an Internet
protocol for registering "resources", like the protocols for
Internet addresses, or for Bitnet Lists, including info on who
owns the list, how to access, etc.  This approach would allow
the maintainers to state conditions for access, & also make them
responsible for updating info which frequently changes (phone #s
etc.).  It would also allow for maintenance of Internet-wide
files of such resources, like the Bitnet List of Lists.
While this approach might be the best long-term strategy, through
the RFC process, it still leaves open questions: _what_ info to
provide (MARC format to the rescue!), how to define "resources"
for registration, or why one should bother with this process.
To settle the latter question, I agree that librarians "should"
start a database someplace: but hey, if these are MARC records,
why not use our several (inter)national databases as test-beds,
since they are set up for searching by subject & format?  And,
since OCLC is doing a test compilation of MARC-format records
for catalogs & such, maybe we should use these records when they
are available?  As a participant in this "experiment", I can supply
more info "as it becomes available".
I offer these comments to create discussion, especially among
Internet veterans, on the feasibility of an Internet protocol for
"resources" (online catalogs, campus or community info systems,
databases, bulletin boards ....).
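As a rough illustration of the registration idea above, here is a
minimal Python sketch of a registry in which each resource's owner
supplies and updates its own record.  No such protocol exists in the
thread; the field names, the ResourceRecord and Registry classes, and
the example.edu addresses are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class ResourceRecord:
    """One registered resource (all field names are hypothetical)."""
    name: str
    kind: str             # e.g. "online catalog", "bulletin board"
    owner: str            # who is responsible for keeping the record current
    access: str           # how to reach the resource
    conditions: str = ""  # conditions for access, stated by the owner
    contact: str = ""     # volatile details such as phone numbers


class Registry:
    """In-memory stand-in for an Internet-wide file of resources."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        if record.name in self._records:
            raise ValueError(f"already registered: {record.name}")
        self._records[record.name] = record

    def update(self, name, requester, **changes):
        """Only the registered owner may change a record."""
        record = self._records[name]
        if requester != record.owner:
            raise PermissionError("only the owner may update this record")
        allowed = {f.name for f in fields(record)} - {"name"}
        for key, value in changes.items():
            if key not in allowed:
                raise KeyError(key)
            setattr(record, key, value)

    def list_of(self, kind):
        """Names of all resources of one kind, like a 'list of lists'."""
        return sorted(r.name for r in self._records.values() if r.kind == kind)


registry = Registry()
registry.register(ResourceRecord(
    name="Example Campus Catalog",
    kind="online catalog",
    owner="systems@library.example.edu",
    access="telnet catalog.example.edu",
    conditions="open to the public",
    contact="+1 555 0100",
))
registry.update("Example Campus Catalog",
                requester="systems@library.example.edu",
                contact="+1 555 0199")
print(registry.list_of("online catalog"))
```

The point of the sketch is only that ownership and maintenance sit in
the same hands: the record changes when its owner says so, and the
shared list is derived from the records rather than compiled by hand.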
*-----

From: emv@msen.com (Edward Vielmetti)
Subject: Re: A suggestion

>    I have a suggestion I want to float out to the lists I read.
>It seems to me that the time has come for some body
>of individuals to load a library catalog-style database, available
>on the internet, of internet resources.

Cataloging the Internet is a Herculean task.  In particular, it's
like cleaning out the Augean stables.

>My question: could these people get
>together, get a machine, get an IP address, get library
>catalog software (or something), 'catalog' what they
>have, provide (and maintain, I know) indices for the
>database (subject, type of resources, etc), and set up a
>mechanism for cataloging future entries?

The best mechanisms for doing this that we have on the net today
don't approach the cataloging problem from the traditional cataloging
perspective.  After all, the problem is not just to sift through the
CD-ROM's worth of discussion generated every three weeks and the
million or so connected systems, but also to try to stay in touch with
the hundreds of new resources that get added to the net each day.
Given that perspective, the CNI and OCLC efforts are at best a very
minor part of the entire process of generating a set of cataloging
tools.

Here are some tools that go beyond the NOTIS card catalogs and
anonymous FTP style "publishing" that made up the Internet circa
1987.  Like all good Internet projects they get a lot of leverage
by extending existing efforts (not starting from scratch) and by
enabling user participation (not relying on "experts").

#1 on the list is the Internet Gopher, a system originally built to
fit the role of campus-wide information system (a la PNN or Techinfo),
and which is simple enough to use and easy enough to add resources
to that dozens of gopher servers have popped up.  Unlike most traditional
Internet client programs, which each do exactly one thing, Gopher lets
users mix and match telnet sessions, file transfers, browsing through
text, reading Usenet newsgroups, and database searches.  Gopher data
is also very reusable -- only one person needs to figure out how to
make a link to a resource, and everyone else can reuse that information
when building their own local view of the net.

Gopher servers will run on Unix, Macs, VMS and CMS.  Clients run on
those plus MS-DOS (and Windows too, I think.)  See
        boombox.micro.umn.edu:/pub/gopher/
for more detail.
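
To make the reuse point concrete: a Gopher menu entry is one
tab-separated line giving an item type character and display name, a
selector, a host, and a port.  The Python sketch below parses such a
line and re-emits it under a different display name for a local menu.
The helper names (parse_menu_line, format_menu_line) and the sample
entry pointing at gopher.example.edu are hypothetical, chosen only for
illustration.

```python
from typing import NamedTuple


class GopherLink(NamedTuple):
    item_type: str   # single character, e.g. "0" = text file, "1" = menu
    display: str     # what the user sees in the menu
    selector: str    # opaque string sent to the server to fetch the item
    host: str
    port: int


def parse_menu_line(line: str) -> GopherLink:
    """Split one menu line into its tab-separated fields."""
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) < 4 or not parts[0]:
        raise ValueError(f"not a Gopher menu line: {line!r}")
    head, selector, host, port = parts[:4]
    return GopherLink(head[0], head[1:], selector, host, int(port))


def format_menu_line(link: GopherLink) -> str:
    """Turn a link back into the line a server would send."""
    return (f"{link.item_type}{link.display}\t{link.selector}"
            f"\t{link.host}\t{link.port}\r\n")


# Hypothetical entry copied from someone else's menu.
sample = "1Gopher software\t/pub/gopher\tgopher.example.edu\t70\r\n"
link = parse_menu_line(sample)

# Reuse it in a local menu under a different display name; the selector,
# host, and port that someone else worked out are carried over unchanged.
local_entry = link._replace(display="Gopher software (shared link)")
print(format_menu_line(local_entry), end="")
```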

#2 is WAIS, Wide Area Information Servers, a project originally
designed to let small computers do easy full-text searches on
million-dollar parallel machines, and which was useful enough in
its Unix forms that hundreds of WAIS servers have popped up.  WAIS
databases are relatively simple to construct; rather than go through
an elaborate process of database normalization or define some baroque
MARC format, WAIS servers assume that CPU and disk are cheap and build
simple keyword indexes.  By using a sort of fuzzy matching algorithm
and relevance feedback, users zero in on a few useful documents pretty
quickly.  I used the WAIS index that keeps track of the last few weeks
of PACS-L to help find materials for this posting.

WAIS software can be had from think.com:/wais/.  Clients are fairly
easy to get going, server software less so; a number of mailing
lists are archived and indexed, so that's the best place to start
for queries that will yield known results.
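
For readers who have not seen this style of retrieval, the Python
sketch below shows the general idea of a simple keyword index with
ranked results and a crude relevance-feedback step.  It is not the
WAIS algorithm or its data format; the scoring rule, the function
names, and the three sample documents are assumptions made up for
illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple keyword tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to a count of its occurrences per document."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word][doc_id] += 1
    return index


def search(index, query, limit=3):
    """Rank documents by how often they contain the query words."""
    scores = Counter()
    for word in set(tokenize(query)):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    return scores.most_common(limit)


def feedback_query(query, relevant_text, extra_terms=3):
    """Crude relevance feedback: widen the query with frequent words
    from a document the user marked as useful."""
    seen = set(tokenize(query))
    counts = Counter(w for w in tokenize(relevant_text)
                     if w not in seen and len(w) > 3)
    added = [word for word, _ in counts.most_common(extra_terms)]
    return " ".join([query] + added)


# Hypothetical sample documents standing in for archived list postings.
docs = {
    "msg1": "Cataloging Internet resources with MARC records",
    "msg2": "Gopher servers for campus information systems",
    "msg3": "MARC format for electronic journals and Internet resources",
}
index = build_index(docs)
results = search(index, "MARC cataloging")
print(results)

# The user marks msg3 as useful; its words widen the next query.
expanded = feedback_query("MARC cataloging", docs["msg3"])
print(search(index, expanded))
```

Nothing here is normalized or cataloged by hand: the index is built
directly from the text, which is the trade of cheap CPU and disk for
cataloging effort described above.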

#3 is netnews, but that doesn't need all that much introduction - if
you can build a community that communicates effectively via
e-mail-like mechanisms, netnews is a relatively cheap way to extend
connectivity all around the world.  News also works well in conjunction
with both gopher and WAIS -- you can use gopher to view the current
contents of a newsgroup, and WAIS to do quick full-text searches.

News software - you might be using it now, otherwise check with the
campus networking folks about getting a news feed.  Clients and servers
run on most systems.  For more information see
        rtfm.mit.edu:/pub/usenet/news.admin/

How did Hercules end up cleaning out the Augean stables?  He diverted
a river and flooded the whole place (I guess the spec didn't mention
anything about preserving the livestock :).  The Herculean task of
providing finder's aids for the net cannot be done without a similar
change in vision away from the centralized all-powerful network intelligence
gathering scheme to a flood of empowered users building shared resources
and perspectives within their communities.

--
Edward Vielmetti, vice president for research, MSEN Inc. emv@msen.com
      MSEN Inc., 628 Brooks, Ann Arbor MI  48103 +1 313 741 1120
"Dogmatic attachment to the supposed merits of a particular structure
   hinders the search for an appropriate structure" -- Robert Fripp
*-----

From: Linda.Newman@UC.Edu
Subject: RE: MARC format for Internet Resources

I've been holding back on commenting on the increased number of
discussion items on PACS-L and other lists about cataloging Internet
resources, a MARC format for Internet resources, bibliographies of
Internet resources, how to cite an Electronic Journal, etc.
The latest message I've seen on PACS-L was a suggestion to put up an
OPAC-like database, available on the Internet, with MARC records
cataloging Internet resources.

I didn't want to start a 'flame', but now it looks like I'm going to ...

I wonder if I'm alone in thinking these efforts a bit absurd,
like someone in the early 1900s maintaining a well stocked barn,
with lots of hay, for their new-fangled automobile.  An OPAC is
a major improvement over the card catalog for indexing print resources
that must be searched for on a shelf, away from the terminal.  But
for electronic resources, I don't think the tools we have used for
providing access to print resources even begin to do the job.  Users
will not be very impressed, in my opinion, with searching an online
database for citations -- derived records -- of other online
databases or systems which they can only access by logging off and then
logging on to something else.  Furthermore, how will we create derived
records (MARC format or otherwise) which even begin to describe the
scope of the resource, and provide specific enough subject indexing
to entire Internet systems (WAIS's), discussion groups (Listservers),
and electronic journals, and in addition describe the tools and
protocols to be used at a local site in order to obtain access?

Even if one argues that a MARC format for Internet resources is not
designed to provide the basis for a user interface, but to provide
bibliographic control over what's 'out there', and to provide a record
structure for some central online list of online lists and
online resources, are we trying to control the uncontrollable, and
what would such control really do for us anyway?

Rather, it seems to me that the kind of access interface we need to
develop should provide DIRECT access to electronic resources.  It
could include derived records, but only for print resources.
It should be based on front-end assistance, with natural language query
systems, hypertext links, keyword searching of the source data, etc., on an
information system which can directly connect to any electronic
resource it knows about.  In other words, beyond WAIS to the Internet
'Scholar's' Workstation, if I may be so bold.

As a former cataloger, I appreciate the intellectual effort behind
the access systems we have put in place.  But I don't think that we
should be wasting the efforts of our profession on applying old tools,
even in an electronic format such as an OPAC, to new resources.  If
we do, our efforts will be only minimally appreciated, and we will
not be valued as we should be as experts on retrieval of
online resources.  Just because we know how to use the old tools,
and haven't quite birthed the new tools, isn't reason enough to
use the old tools where they just won't work.  (Although I think that
hay was used around the 1900s for stuffing automobile tires.)

  | Linda Newman               |    BITNET: LNEWMAN@UCBEH            |
  | University of Cincinnati   |  INTERNET: LNEWMAN@UCBEH.SAN.UC.EDU |
  | Library Systems Office     |       FAX: (513)556-2161            |
  | Cincinnati, OH 45221-0033  |     VOICE: (513)556-1441            |
