Re: Content [looking for a search engine]

From: Eric Lease Morgan <emorgan_at_nyob>
Date: Fri, 9 Jun 2006 12:01:36 -0400
To: NGC4LIB_at_listserv.nd.edu

On Jun 9, 2006, at 7:49 AM, Steven Carr wrote:

> As far as the catalog fits in...Given this extremely broad (and
> broadening) picture of who the users are (and what languages they
> speak), is it a "catalog" we are looking for, or a search engine
> that does what the catalog has traditionally done, plus the ability
> to do a whole lot more?

Yes, I believe we are looking for a search engine -- index -- that
does what the catalog has traditionally done, plus the ability to do
a whole lot more.

Given the information environment we are increasingly living in, I
suspect most libraries are not really looking for a catalog as much
as they are looking for some other sort of index. While a catalog is
a necessary library tool, it contains neither the content nor the
functionality people increasingly expect. For the most part, library
catalogs are inventory lists -- specialized indexes to the things a
library owns (or licenses). In an environment of digital and
networked information, people don't feel limited to the content
within a specific space. Nor do they want to go through the process
of retrieving a book.

A patron might ask this sort of question: "I have a family member who
has developed cancer. What content can you give me describing the
effects of chemotherapy?" The answer may involve content that is not
located in the library, but that does not mean a library cannot
provide such information. It is quite possible for libraries to
collect (read "harvest" or "mirror" and then index) content regarding
cancer or other topics and include it in a tool used to address this
need.

I don't advocate throwing away the catalog. Instead, I advocate
creating an additional index made up of traditional catalog content
supplemented with other content relevant to the needs of a particular
library's users. This supplemental content could be metadata, such as
the data from EAD files or CIMI files. It could also include
descriptions of journal articles.
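
To make the idea of one index over mixed content concrete, here is a
minimal sketch in Python. It is an illustration only, not a
description of any existing system: the record layout (the "id",
"source", and "text" fields), the sample records, and the function
names are all assumptions made up for this example, and a real index
would need proper parsing of MARC, EAD, and article metadata.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each term to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record in records:
        for term in tokenize(record["text"]):
            index[term].add(record["id"])
    return index


def search(index, query):
    """Return the ids of records that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical sample data: one catalog record plus supplemental content.
catalog = [
    {"id": "cat:1", "source": "catalog",
     "text": "Chemotherapy: a patient's guide"},
]
supplemental = [
    {"id": "ead:1", "source": "ead",
     "text": "Papers of an oncology clinic, 1950-1980"},
    {"id": "art:1", "source": "article",
     "text": "Side effects of chemotherapy in older patients"},
]

# One index over everything, regardless of where a record came from.
index = build_index(catalog + supplemental)
hits = search(index, "chemotherapy")
```

The point of the sketch is that the catalog record and the article
are found by the same query against the same index, so the patron
does not have to know which silo holds the answer.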

Moreover, I would advocate this "Über Index" contain as much full
text as possible, not just pointers. This full text would include
ebooks, articles from open access literature, theses & dissertations
where appropriate, (Wikipedia) encyclopedia articles, the definitions
of words, biographies, images, data sets, etc. I would add these full
text things to save the time of the reader and reduce the number of
links they must go through to acquire their information.

I do not advocate meta-search to facilitate this index because it is
not practical, holistic, or scalable. While Z39.50 can search
multiple targets, it cannot do this well because of different
indexing schemes and the existence of duplicate records. Did you say,
"Screen scraping?" Ick.

Furthermore, not only would I add this additional content, but I
would also provide additional services against the content such as
but not limited to: download, review, annotate, suggest, get email
address of author of, save, trace idea backwards & forwards, find
more like this, purchase, authenticate, print, share, tag, compare to
other selected items, summarize, create citation for, email, etc.
After you find the thing, you want to get it, and once you get the
thing, you want to do something with it.
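
As one example of a service that becomes possible once the full text
is held locally, here is a rough sketch of "find more like this" in
Python. It uses plain Jaccard similarity over word sets, which is a
deliberately simple stand-in chosen for this illustration and not a
recommendation; the document ids, the sample texts, and the function
names are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def more_like_this(doc_id, documents, limit=3):
    """Rank the other documents by word overlap with the given one."""
    target = tokens(documents[doc_id])
    scored = [
        (jaccard(target, tokens(text)), other_id)
        for other_id, text in documents.items()
        if other_id != doc_id
    ]
    # Highest score first; ties are broken by id to keep results stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other_id for score, other_id in scored[:limit] if score > 0]


# Hypothetical full text held in the local index.
documents = {
    "a": "effects of chemotherapy on patients",
    "b": "chemotherapy side effects in older patients",
    "c": "history of the card catalog",
}

similar = more_like_this("a", documents)
```

A service like this depends on having the text itself in the index;
with only pointers to remote content there would be nothing to
compare.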

By creating such collection/service combinations, libraries would
build on their ability to collect and filter information for specific
audiences and distinguish themselves from the more general Internet
indexes.

--
Eric Lease Morgan
Head, Digital Access and Information Architecture Department
University Libraries of Notre Dame

(574) 631-8604

I'm hiring a Senior Programmer Analyst.
See http://dewey.library.nd.edu/morgan/programmer/.