[175] in ad-lib

019 indexing (alas, somewhat long)

daemon@ATHENA.MIT.EDU (Tom Owens)
Thu Mar 16 15:40:11 1995

To: ad-cat@MIT.EDU
Date: Thu, 16 Mar 1995 15:39:58 EST
From: Tom Owens <owens@MIT.EDU>

I'd like to raise some issues regarding indexing the 019 as part of the OCLC
index.

As you know, OCLC occasionally merges two or more existing records
into one.  Some OCLC numbers then become obsolete, and the obsolete
numbers are recorded in repeated subfields of the 019.

This causes problems when loading records, because we may have one
of the obsolete records in our database.  The 001 of a record loaded from
tape or ftp will then not match the 001 of the database record, and a
duplicate record will be created.

The _standard solution_ to this problem is _not_ to index the 019.  It is,
rather, to use this algorithm when matching records:

	does the incoming record 001 match an existing record 001 ?
	if so -- overwrite the existing record
	if not --
		does any incoming record 019 subfield match an existing
		  record 001 ?
		if so -- overwrite the existing record
		if not -- create a new record.
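
For anyone who prefers to see the steps above as code, here is a minimal
Python sketch.  It is not how the loader is written; the names
match_incoming and load, and the dict-based record and database layout,
are hypothetical and chosen only for illustration.  It also assumes that
a record matched through an 019 is re-filed under the incoming 001.

```python
def match_incoming(incoming, database):
    """Return (action, key) for an incoming record.

    incoming -- dict with "001" (str) and "019" (list of obsolete numbers)
    database -- dict mapping each existing record's 001 to that record
    Both layouts are illustrative assumptions, not a real loader format.
    """
    key = incoming["001"]
    # Step 1: does the incoming 001 match an existing 001?
    if key in database:
        return ("overwrite", key)
    # Step 2: does any incoming 019 subfield match an existing 001?
    for obsolete in incoming.get("019", []):
        if obsolete in database:
            return ("overwrite", obsolete)
    # Step 3: no match at all, so this is a new record.
    return ("create", key)


def load(incoming, database):
    """Apply the decision from match_incoming to the database."""
    action, key = match_incoming(incoming, database)
    if action == "overwrite" and key != incoming["001"]:
        # Assumption: the old record is re-filed under the current 001.
        del database[key]
    database[incoming["001"]] = incoming
    return action
```

Note that the lookup only ever consults existing 001s, so nothing here
requires an index on the 019.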

As you see, the task is to check every 019 in the incoming record (which is
not yet indexed), not to check existing 019s in the database.

Given that indexing 019s (which we do not currently do) does not solve
the loading problems, is there any other reason not to index 019s?

Theoretically, yes, but not necessarily.  Generally, in a matching index 
every key should have one and only one associated record.  In addition,
every record should have one and only one associated key.  This makes duplicate
detection easier for the loader, batch programs, and human beings.  I have no
idea what will happen in Advance if we don't follow the one record/one key
rule.  In current GLIS, it would cause some programs to fail to work properly.
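
To make the one record/one key rule concrete, here is a small Python
sketch that builds a toy matching index with and without the 019 and
reports where the rule breaks.  The functions build_index and
rule_violations and the record layout are hypothetical; this says nothing
about how Advance or GLIS actually store their indexes.

```python
from collections import defaultdict


def build_index(records, index_019=False):
    """Map each key to the set of record ids filed under it.

    records -- dict of record id -> {"001": str, "019": [str, ...]}
    (an illustrative layout, not a real system format)
    """
    index = defaultdict(set)
    for rec_id, rec in records.items():
        index[rec["001"]].add(rec_id)
        if index_019:
            for obsolete in rec.get("019", []):
                index[obsolete].add(rec_id)
    return index


def rule_violations(index):
    """Return (keys with several records, records with several keys)."""
    keys_per_record = defaultdict(set)
    for key, rec_ids in index.items():
        for rec_id in rec_ids:
            keys_per_record[rec_id].add(key)
    shared_keys = sorted(k for k, ids in index.items() if len(ids) > 1)
    multi_key = sorted(r for r, ks in keys_per_record.items() if len(ks) > 1)
    return shared_keys, multi_key


# Record A is a merged record that lists B's number as obsolete.
records = {
    "A": {"001": "222", "019": ["111"]},
    "B": {"001": "111", "019": []},
}
print(rule_violations(build_index(records)))                  # 001 only
print(rule_violations(build_index(records, index_019=True)))  # 001 + 019
```

With only the 001 indexed, both lists come back empty.  With the 019
indexed as well, key 111 points at two records and record A sits under
two keys.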

So, with no evidence to the contrary, I would argue for making sure the 
loader looks at incoming 019s and that we not index the 019.  If we decide we 
need the 019 indexed for other reasons, we should look carefully at the
implications of that decision, which I cannot predict.


--	
Tom Owens
MIT Library Systems Office
owens@mit.edu
617-253-1618 voice 617-253-8894 fax

