[125] in Public-Access_Computer_Systems_Forum

Describing Internet Resources Paper - PART II

daemon@ATHENA.MIT.EDU (Priscilla Caplan)
Wed Apr 29 15:30:32 1992

Date:         Wed, 29 Apr 1992 13:54:35 CDT
Reply-To: Public-Access Computer Systems Forum <PACS-L%UHUPVM1.BITNET@RICEVM1.RICE.EDU>
From: Priscilla Caplan <COTTON%HARVARDA.BITNET@RICEVM1.RICE.EDU>
To: Multiple recipients of list PACS-L <PACS-L@UHUPVM1.BITNET>

----------------------------Original message----------------------------

MARC description of electronic data resources

Clearly, at least some electronic data resources are already
accommodated in the USMARC computer files format.  This is
defined for use for "information encoded in a manner that allows
it to be processed by a computer or related machine, including
both data stored in machine-readable form and the programs used
to process the data."  The format can be used for files
containing numeric data, representational (pictorial or graphic)
data, text, and/or software.  Databases (collections of machine-
readable records) like BIOSIS Previews should fit under this
definition.  So should text files like the ASCII version of
RFC-822, computer software like Xferit, and the pictorial or
image file of a journal article.

Although data resources can logically be accommodated in the
computer files format, several issues need to be addressed.
First, this type of data stretches the traditional focus on
publication and description.  The data may or may not be formally
published, or issued in any definitive form.  In many cases,
while the intellectual content remains stable, the physical
representation changes from location to location (e.g. whether
the data is on disk or diskette, in ASCII or EBCDIC, etc.).

Second, new types of identifying numbers may be relevant.  A
subgroup of CNI is working on document identifiers for Internet
resources, which should be accommodated in MARC when defined.

Third, new data elements may be required for encoding the
location of the resource.  A print index has a physical location
which probably consists of a holding library and call number.
For an index available online through HOLLIS or BRS, the physical
location (perhaps a storage device in a computing facility) is
irrelevant.  The system, HOLLIS or BRS itself, is the information
required to locate the item.

In USMARC, the location is encoded in the 852 field, which is
formally part of the Holdings format but may be embedded in
bibliographic records.  This field contains subfields for
location (including library, sublocation or collection, etc.),
shelving location, identifying numbers/codes, descriptors, and
notes.  Location itself is defined as the NUC symbol of the
organization holding the item or from which it is available.  To
allow the designation of electronic "locations" such as HOLLIS,
GLADIS, BRS, DIALOG, ftp sites, data archives, etc., either the
852 must be extended, or a new field (e.g. 851) defined.
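
As an illustration only, the two options above can be sketched as
a small Python data structure.  The class and attribute names are
hypothetical and are not USMARC subfield codes; the tag "851" is
the example tag suggested above, not a defined field; and "XXX"
and "Z 1234" are placeholders for a real NUC symbol and call
number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocationField:
    """Hypothetical model of a location field; not an actual USMARC layout."""
    tag: str                                 # "852", or the suggested new "851"
    location: str                            # NUC symbol, or a system/service name
    sublocation: Optional[str] = None        # library sublocation or collection
    shelving_location: Optional[str] = None  # where the item is shelved

    def is_electronic(self) -> bool:
        # Assumption: electronic "locations" get their own tag (second option).
        return self.tag == "851"


# A print index held by a library (placeholder symbol and call number).
print_copy = LocationField("852", "XXX", sublocation="Reference",
                           shelving_location="Z 1234")

# The same index offered online; only the system name locates it.
online_copy = LocationField("851", "HOLLIS")

for copy in (print_copy, online_copy):
    kind = "electronic" if copy.is_electronic() else "physical"
    print(copy.location, "->", kind)
```

Extending the 852 instead would amount to keeping a single tag
and letting the location value be either a NUC symbol or a system
name.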

Note that the properties of being online or Internet-accessible
do not actually belong to the data resource, but rather to the
systems/services through which access to the data is offered.
Therefore extensions to the computer files format to accommodate
access information may not be required if information sufficient
to identify the "holding" system/service is provided.

MARC description of online systems/services

This category includes (but is not limited to) library systems
such as HOLLIS and GLADIS, commercially available systems such as
BRS and DIALOG, campus-wide information systems, community-wide
information networks like the Cleveland Freenet, academic and
commercial ftp sites, and bulletin boards.  A good rule of thumb
for distinguishing systems/services from data resources is
whether the entity has an Internet (TELNET) or dial-up address.

Online systems and services seem to fit poorly into the
bibliographic formats.  The concepts of authorship, publication,
physical description, and series do not apply.  On the other
hand, owners or sponsors, contact persons, addresses, hours of
service, and other access information are important data
elements.  Many of these data elements are defined in the
provisional USMARC format for community information, which was
formulated for the description of non-bibliographic resources
including "programs, services, organizations, agencies, single
and ongoing events, and individuals..."   Another point of
commonality is that online systems/services, like community
agencies and programs but unlike bibliographic entities, tend to
be one-of-a-kind.

Relationship between data resources and systems/services

A one-to-many relationship can exist between any given data
resource and the systems/services that offer access to it.  For
example, the Academic Index could be available through both
HOLLIS and DIALOG.  Presumably, the "bibliographic" record
describing each data resource would contain one location field
for each relevant system/service.  The location field should not
be defined to contain all information relevant to accessing the
data resource via that system/service (TELNET address, logon
instructions, etc.).  Rather, the location field should contain
enough information to direct the user to a non-bibliographic
record for the system/service.  That record in turn would contain
all the necessary information for accessing that system, getting
help, etc.
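
The arrangement described above might be sketched in Python as
follows.  This is a minimal illustration under stated
assumptions, not a proposed record layout: the class names, the
attribute names, and the addresses and logon text are all
hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceRecord:
    """Non-bibliographic record for a system/service."""
    name: str
    telnet_address: str       # placeholder values below, not real addresses
    logon_instructions: str


@dataclass
class DataResourceRecord:
    """'Bibliographic' record for a data resource."""
    title: str
    # One entry per holding system/service: just its name, no access details.
    locations: List[str] = field(default_factory=list)


def access_routes(resource: DataResourceRecord,
                  services: Dict[str, ServiceRecord]) -> List[ServiceRecord]:
    """Follow each location to the service record that holds the details."""
    return [services[name] for name in resource.locations if name in services]


services = {
    "HOLLIS": ServiceRecord("HOLLIS", "hollis.example.edu", "placeholder text"),
    "DIALOG": ServiceRecord("DIALOG", "dialog.example.com", "placeholder text"),
}
academic_index = DataResourceRecord("Academic Index", ["HOLLIS", "DIALOG"])

for service in access_routes(academic_index, services):
    print(service.name, service.telnet_address)
```

Because the access details live only in the service record, a
change of address would be made once rather than in every data
resource record that names the service.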

Similarly, a one-to-many relationship exists between any given
online system/service and the electronic data resources to which
it offers access.  For example, the HOLLIS system offers access
to many electronic data resources, including the Academic Index
and the union catalog of the Harvard libraries.  The
system/service record for HOLLIS should indicate the data
resources it contains.  The name of each data resource could
appear as an access point in the record for HOLLIS, either as a
subject heading or as a name added entry.  The advantage of this
is that if records for both types of entity were contained in the
same catalog or directory, then a user searching "Academic Index"
could retrieve not only the record describing the Academic Index
but also the system/service record for each of the systems
offering access to it.  Alternatively, these might be listed in a
contents note (505).
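
The retrieval behavior described above can be illustrated with a
toy index in Python.  This is only a sketch: the record
identifiers and the build_index function are hypothetical, and a
real catalog would distinguish subject headings from added
entries rather than pooling all access points together.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(records: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each access point (case-folded) to the records carrying it."""
    index = defaultdict(list)
    for record_id, access_points in records.items():
        for point in access_points:
            index[point.lower()].append(record_id)
    return index


# Record id -> access points.  The service records carry the names of
# the data resources they offer as additional access points.
records = {
    "resource:Academic Index": ["Academic Index"],
    "service:HOLLIS": ["HOLLIS", "Academic Index",
                       "Union catalog of the Harvard libraries"],
    "service:DIALOG": ["DIALOG", "Academic Index"],
}

index = build_index(records)
hits = index["academic index"]
print(hits)
```

A single search on "Academic Index" returns the data resource
record together with the records for both systems offering it.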

Users could be expected to find data resources through records
for systems/services, and vice versa.  For example, a user
looking for RFC-822 might make use of broad subject descriptors
in system/service records to find ftp sites likely to provide
this document.  Conversely, a user finding a record for the
Academic Index in some database might then look up the record for
a "holding" system/service to obtain a TELNET address and logon
instructions.

Questions and issues

This paper offers a framework for discussing electronic data
resources and online systems and services.  Even if it is
basically acceptable, however, many issues must be resolved
before a workable mapping to USMARC can take place.

In the case of data resources, the electronic form may be only
one of many forms, and there can be several electronic forms as
well.  The ASCII text of a document could be printed, for
example, and the print then scanned and made available as
bitmapped images.  Can these be
treated as multiple versions, with one bibliographic record and
multiple holdings, or do different physical formats constitute
different bibliographic entities?

How much data about systems/services should also be carried in
location fields in data resource records?  It would be
inefficient to have to repeat all access information and
instructions redundantly in every location field, particularly as
that information can be relatively dynamic and require frequent
updating.  On the other hand, should the user be required to look
up two records to access any resource?  How do we guarantee that
the user has access to both types of records whenever necessary?

Facilities available through LISTSERV software require more
thought, as do the descriptions of ftp sites. Is PACS-L best
thought of as a data resource with a "location" of LISTSERV at
UHUPVM1 or as a system/service, like a bulletin board?  Is a
named directory at an ftp site part of the location of an item?
The line between data resources and systems/services can also be
less clear than one would like, for example when a library
information system combines data from a catalog database and a
circulation file to display circulation status.

A note on format integration

This document contains several references to the computer files
format.  After format integration, fields will be valid for use
in describing any item to which they apply, regardless of format.
This does not affect any of the discussion above.  However,
please note that format integration affects only the so-called
"bibliographic" formats (books, serials, maps, manuscripts,
music, computer files, and visual materials).  Non-bibliographic
formats such as authorities, holdings, and community information
will not be affected.
