content beyond books

From: Eric Lease Morgan <emorgan_at_nyob>
Date: Wed, 22 Aug 2007 11:35:38 -0400
To: NGC4LIB_at_listserv.nd.edu
I assert that library "catalogs" need to include content beyond
books. [1]

More specifically, I assert library "catalogs" need to include
article-level content. Apparently in the early 20th century some
library catalogs did contain article-level content, but maintaining
all of those cards for all of those articles was just not scalable.
This created an opening for H.W. Wilson and his various indexes, and
it is an example of library outsourcing. [2]

Article-level content abounds on the Internet. Using protocols such
as OAI-PMH (among others), it is possible to harvest article-level
data (and even the articles themselves) and include it in your
"catalog". OAIster is probably the biggest example. It is an index of
about 10,000,000 items accessible via OAI, including articles, theses
& dissertations, images, manuscripts, etc.:

   http://www.oaister.org/
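
As a rough illustration of what such harvesting involves, here is a
minimal Python sketch. It builds a ListRecords request and parses a
Dublin Core (oai_dc) response. The base URL and the sample record are
invented for the example, and the function names are hypothetical;
this is not how OAIster or MyLibrary is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample of what a repository might return (not real data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:article-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Invented Article Title</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>http://example.org/articles/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url):
    """Return a ListRecords request URL asking for oai_dc metadata."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier, titles, creators, and URLs from each record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
            "urls": [e.text for e in record.iterfind(".//dc:identifier", NS)],
        })
    return records


if __name__ == "__main__":
    # http://example.org/oai is a placeholder, not a real repository.
    print(build_list_records_url("http://example.org/oai"))
    for item in parse_records(SAMPLE_RESPONSE):
        print(item)
```

A real harvester would also need to fetch the URL, follow resumption
tokens, and handle deleted records; the sketch leaves all of that out.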

I have created two similar indexes, both using MyLibrary. [3] The
older one harvested content from NSF Digital Library repositories. It
includes about 430,000 pointers, and its content was (automatically)
"cataloged" and enhanced. Its user interface sports spelling
correction and some thesaurus assistance:

   http://mylibrary.ockham.org/
   http://www.dlib.org/dlib/october05/morgan/10morgan.html

I created a much simpler interface using content from the Directory
of Open Access Journals. This index contains only 54,000 items, but
it includes an authority list of publishers and sources (journals). [4]
The primary purpose of this implementation is to demonstrate the
possibilities of MyLibrary. It is not a production service:

   http://dewey.library.nd.edu/mylibrary/demos/article-index/

It is not a very large leap to add the metadata describing the
articles into a library "catalog". Authors. Journal title. Article
title. Notes. Location (URL). Subject headings. Why not?
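
To make the point concrete, an article-level record needs only a
handful of fields. The Python sketch below is one hypothetical way to
model it; the class name, field names, and sample values are all
invented for illustration and do not come from any particular catalog
or from MyLibrary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArticleRecord:
    """One article described with the fields listed above."""
    article_title: str
    journal_title: str
    url: str
    authors: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)
    notes: str = ""


def to_display_line(record):
    """Format a record as a one-line brief display for a results list."""
    authors = "; ".join(record.authors) or "[no author]"
    return "{}. {}. {}. {}".format(
        authors, record.article_title, record.journal_title, record.url)


if __name__ == "__main__":
    # Invented sample values, not a real article.
    sample = ArticleRecord(
        article_title="An Invented Article Title",
        journal_title="Journal of Hypothetical Studies",
        url="http://example.org/articles/1",
        authors=["Doe, Jane"],
        subjects=["Library catalogs"],
    )
    print(to_display_line(sample))
```

Nothing here is specific to articles; the same shape would describe a
book with "journal title" swapped for publisher, which is rather the
point.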

People don't care what format information is in (as long as it is not
microfiche), and information silos are not the way to go. Silos are
frustrating. One for (primarily) books. Many for articles. One
representing institutional repository content. One for archives. One
for special collections. One for Internet resources. Etc. Metasearch
is not going to cut it. It is a nice try, but it is not able to
fulfill its promise since relationships between content in its
disparate resources are not strong. Different vocabularies. Different
fields. Different ranking and sorting algorithms. First name last.
Last name first. On and on.
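
The name-order problem alone shows how much reconciliation a merged
result set requires. Below is a deliberately naive Python sketch that
forces names into "Last, First" form. The function name is
hypothetical and the heuristic (treat the final word as the surname)
is an assumption that fails for multi-word surnames, suffixes, and
many non-Western names.

```python
def normalize_name(name):
    """Return a name in "Last, First" order using a naive heuristic."""
    # Collapse runs of whitespace so comparisons are consistent.
    name = " ".join(name.split())
    if "," in name:
        # Already last-name-first; just tidy the spacing.
        last, _, first = name.partition(",")
        return "{}, {}".format(last.strip(), first.strip())
    parts = name.split(" ")
    if len(parts) == 1:
        # A single word: nothing to reorder.
        return parts[0]
    # Assumption: the final word is the surname.
    first = " ".join(parts[:-1])
    return "{}, {}".format(parts[-1], first)


if __name__ == "__main__":
    for raw in ["Eric Lease Morgan", "Morgan,  Eric Lease"]:
        print(normalize_name(raw))
```

Both inputs come out as the same string, so records from two sources
could at least be sorted and grouped together. Doing this after the
fact, for every field, across every silo, is what metasearch is up
against.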

Bringing all content together, as much as possible, into a single
index will make everybody's life a lot easier. If this is to be
done, then the definition of the "catalog" needs to be broadened, and
a wider range of people need to be brought into the discussion, such
as collection managers and bibliographers. What is the scope of a
library "catalog"? To what degree is it an inventory list, and to
what degree is it a list of content designed to meet the needs of its
clientele combined with sets of services applied against that content?


Notes

[1] Yes, I know. Library catalogs do contain information describing
things other than books, but you know as well as I do that the world
of information has grown well beyond codexes to include mailing
lists, data sets, computer programs, images, sounds, spreadsheets,
PDF documents, etc. Yet still, library catalogs primarily include
descriptions of books.

[2] This is also an example of how we created information "silos" in
our libraries. Lorcan Dempsey describes this predicament in greater
detail in one of his blog postings. See:
http://orweblog.oclc.org/archives/001379.html

[3] http://mylibrary.library.nd.edu or
http://dewey.library.nd.edu/mylibrary/

[4] No, it does not include a name authority list, but it could. I
just didn't implement that part.


--
Eric Lease Morgan
University Libraries of Notre Dame
Received on Wed Aug 22 2007 - 09:08:27 EDT