[921] in Commercialization & Privatization of the Internet
Re: how big is the LoC?
daemon@ATHENA.MIT.EDU (jqj@duff.uoregon.edu)
Mon Jul 8 13:30:35 1991
To: Craig Partridge <craig@sics.se>
Cc: com-priv@uu.psi.com
In-Reply-To: Message from craig@sics.se, dated
Date: Mon, 08 Jul 91 10:29:01 MDT
From: jqj@duff.uoregon.edu
Craig,

I don't think your calculations really address the question. The LoC does
not store its data as bitmaps, so why should a raster bitmap be the
appropriate way to measure the size of a graphical item in the LoC? For
example, we might do the calculation assuming some reasonable compression
algorithm. If we figure 1:2 for text and 1:20 for 8-bit images, we get 1
million graphical items taking only 4 TB, and 80 million 300 KB text items
taking 12 TB, or about 16 TB in all. So 25 TB total might in fact be
reasonable (or might not).
For some types of LoC entry, it might even be appropriate to store the
entry as some odd format (splines for handwriting immediately come to
mind), with much better than 1:20 compression compared to simple rasters.
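
The arithmetic above can be checked with a rough sketch like the following. It assumes decimal units (1 TB = 10**12 bytes), and the helper name compressed_tb is only illustrative. The message gives just the compressed total for graphics, so the raw size per image is worked backwards from it rather than taken from any stated figure.

```python
# Back-of-the-envelope check of the storage figures in the message.
# Assumption: decimal units (1 KB = 10**3 bytes, 1 TB = 10**12 bytes).
KB = 10**3
TB = 10**12


def compressed_tb(items, raw_bytes_per_item, ratio):
    """Total storage in TB after compressing each item by ratio:1."""
    return items * raw_bytes_per_item / ratio / TB


# Text: 80 million items of 300 KB each, compressed 2:1.
text_tb = compressed_tb(80_000_000, 300 * KB, 2)

# Graphics: only the compressed total is given (4 TB for 1 million
# items at 20:1), so derive the raw size per image that it implies.
graphics_tb = 4.0
raw_bytes_per_image = graphics_tb * TB * 20 / 1_000_000

total_tb = text_tb + graphics_tb

print(f"text:     {text_tb:.0f} TB")
print(f"graphics: {graphics_tb:.0f} TB "
      f"(implies {raw_bytes_per_image / 10**6:.0f} MB raw per image)")
print(f"total:    {total_tb:.0f} TB")
```

Under these assumptions the two categories sum to less than the 25 TB figure, which leaves some headroom for different compression ratios or item counts.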
The real problems are (1) some of the LoC's holdings are videotapes,
films, audio, or 3-dimensional, all of which take substantial amounts of
digital storage, and (2) there is a wide and unpredictable range as to
what characteristics of a holding are needed by a patron. For an example
of the latter, even at 600 dpi and 32 bits per pixel your graphical images don't
include the characteristics of the paper or ink, which may be important to
some library patrons (e.g. someone trying to detect a forgery).

So, I think the best we can say about the "size" of the LoC (as it relates
to the size of the NREN pipe that would be needed to deliver it) is that
it is probably somewhere between 10 TB and 1000 TB depending at least as
much on the characteristics of the assumed patron base as on the
characteristics of the data itself. That's a pretty big range.
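
To get a feel for what that range means for a pipe, here is a minimal sketch of the time needed to move the low and high estimates over a link. The link rates are illustrative assumptions, not figures from this thread: T1 and T3 are the standard line rates, and the 1 Gbit/s entry is purely hypothetical. It also assumes the link is fully utilized with no protocol overhead, which is optimistic.

```python
# Time to move a collection of a given size over a link of a given rate.
# Assumptions: decimal terabytes, a fully utilized link, no protocol
# overhead. The link rates below are illustrative, not from the thread.
TB = 10**12  # bytes


def transfer_days(size_tb, link_bits_per_sec):
    """Days needed to send size_tb terabytes at a sustained link rate."""
    seconds = size_tb * TB * 8 / link_bits_per_sec
    return seconds / 86_400


links = {
    "T1 (1.544 Mbit/s)": 1.544e6,
    "T3 (44.736 Mbit/s)": 44.736e6,
    "1 Gbit/s (hypothetical)": 1e9,
}

# Low and high ends of the size range estimated in the message.
for size_tb in (10, 1000):
    for name, rate in links.items():
        days = transfer_days(size_tb, rate)
        print(f"{size_tb:>5} TB over {name}: {days:,.1f} days")
```

Since the time scales linearly with size, the hundredfold spread in the size estimate carries straight through to a hundredfold spread in the pipe needed.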
The U.S. Patent Office has, I believe, been computerizing its database
over the last few years. Perhaps someone could comment on the state of
that project and the size of that (heavily graphical) database?

JQ Johnson
Director of Network Services    Internet: jqj@oregon.uoregon.edu
University of Oregon            voice: (503) 346-1746
250E Computing Center           BITNET: jqj@oregon
Eugene, OR 97403-1212           fax: (503) 346-4397