Re: discovery systems need to do more

From: Laval Hunsucker <amoinsde_at_nyob> Date: Tue, 31 Aug 2010 15:22:16 -0700 To: NGC4LIB_at_LISTSERV.ND.EDU

> After people find they want to do a myriad of other things:
> retrieve, read, compare & contrast, annotate, analyze, learn,
> verify, rate, rank, review, share, summarize, etc. There is no
> logical reason why librarianship can not evolve towards these
> sorts of services. There may be financial and political reasons
> but not logical ones.

No, no strictly logical reasons -- but surely cognitive reasons, 
and affective reasons, and probably ethical reasons why at least 
much of this scenario could be considered not only unrealistic 
or impractical, but also undesirable?

But I like the rest of your "replies", especially nos. 4 and 5. Very 
much worth reflecting on -- and potentially doing something with 
-- imho. Thanks for those.

- Laval Hunsucker
   Breukelen, Nederland

----- Original Message ----
From: Eric Lease Morgan <emorgan_at_ND.EDU>
To: NGC4LIB_at_LISTSERV.ND.EDU
Sent: Tue, August 31, 2010 11:09:28 PM
Subject: Re: [NGC4LIB] discovery systems need to do more

> I think library "discovery systems" and/or catalogs need to do more, and here's 
>why...

I appreciate all the comments. Thank you. The posting was designed to generate 
discussion, and here are some of my replies:

  1. Catalogs -- Yes, traditionally, library catalogs have been about the 
physical collections, but that was only true because collections were physical to 
begin with. Instead, I consider library catalogs -- and I mean that word exactly 
-- to be lists of resources pertinent to the information needs of our 
constituents. They represent -- and I mean that word too -- library collections. 
But I don't think library catalogs necessarily need to describe things physical. 
They can also describe things electronic as well as things outside our walls.

  2. Databases versus indexes - Electronic card catalogs are essentially 
database applications. Well-designed ones are built on relational databases 
complete with normalization. Databases are very good at creating and maintaining 
information, but they s\/ck when it comes to search. This is because a person 
must know the structure of the database in order to completely exploit it. 
Indexers, on the other hand, s\/ck when it comes to creating and maintaining 
information, but they are great when it comes to search. No need to know syntax, 
and the power of statistical prediction comes into play. "Discovery systems" are 
indexers, and believe it or not, the majority of them are based on the same open 
source technology (Lucene/Solr).
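
To make the database/indexer contrast concrete, here is a minimal sketch of 
what an indexer does: build an inverted index, then rank documents by a simple 
TF-IDF score. It is an illustration only, not how Lucene/Solr is implemented. 
The sample documents, the function names, and the scoring formula are all 
assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term frequency} (an inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Rank documents by a simple TF-IDF sum over the query terms."""
    scores = Counter()
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        # Rare terms count for more than common ones.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]


# Made-up sample "collection" for illustration.
docs = {
    "a": "The Prince, a treatise on politics and power",
    "b": "Ulysses, a novel set in Dublin",
    "c": "Discourses on politics",
}
index = build_index(docs)
results = search(index, len(docs), "politics power")
print(results)
```

The searcher types plain words and needs no knowledge of fields or tables. The 
ranking comes from term statistics rather than from the structure of the data.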

  3. Find is not the problem to solve - Well-implemented indexers work very well 
satisfying the majority of people's information seeking needs. Everything else 
is time spent after reaching the point of diminishing returns. That does not 
mean there is not a whole lot of relevant information for people to sift through. 
What is needed, IMHO, are more tools enabling people to evaluate, use, and 
understand the information they find. Find is only the first and not the most 
important part of their process. After people find they want to do a myriad of 
other things: retrieve, read, compare & contrast, annotate, analyze, learn, 
verify, rate, rank, review, share, summarize, etc. There is no logical reason 
why librarianship can not evolve towards these sorts of services. There may be 
financial and political reasons but not logical ones.

  4. Functionality - The functionality of digital content complements the 
functionality of analog content. We need to take advantage of this. Because 
digital content is easily "read" by computers, it lends itself to a host of 
different ways to interpret it. For example, take size. Telling me a book is 108 
pages long tells me very little, but telling me a book is 249,987 words long 
tells me something less ambiguous. (James Joyce's Ulysses is about 280,000 words 
long whereas Machiavelli's The Prince is about 10 times shorter at 30,731 words 
long.) Here's another example. A statistical measure called "vocabulary density" 
is the ratio of unique words in a text compared to the total number of words in 
a text. In general, texts with higher densities are more difficult to read. Many 
other types of statistical analyses can be done if and only if the content 
exists in digital form. If these sorts of numeric characteristics were included 
in the descriptions of our collections, then we could provide faceted browsing 
against them. People could then compare & 
contrast works quantitatively as well as qualitatively. "I want to find books on 
philosophy that are short but have a high vocabulary density and written before 
1800." Library catalogs and "discovery systems" do not provide this sort of 
functionality.
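
As a rough sketch of Item #4, the snippet below computes a word count and a 
vocabulary density (unique words divided by total words) for a text, then 
filters a small catalog on those numbers. The tokenizing rule, the field 
names, the catalog entries, and the thresholds are all made up for 
illustration. Note that this ratio tends to fall as texts get longer, so 
comparisons are fairest between texts of similar size.

```python
import re


def text_stats(text):
    """Return total words, unique words, and vocabulary density for a text."""
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(words)
    unique = len(set(words))
    density = unique / total if total else 0.0
    return {"words": total, "unique": unique, "density": density}


def facet(records, subject, max_words, min_density, before_year):
    """Keep records matching a subject plus numeric limits (the 'facets')."""
    return [
        r for r in records
        if r["subject"] == subject
        and r["words"] <= max_words
        and r["density"] >= min_density
        and r["year"] < before_year
    ]


stats = text_stats("the cat sat on the mat the end")
print(stats)

# Hypothetical catalog records; every value here is invented.
catalog = [
    {"title": "Hypothetical Work A", "subject": "philosophy",
     "year": 1750, "words": 30000, "density": 0.21},
    {"title": "Hypothetical Work B", "subject": "philosophy",
     "year": 1780, "words": 250000, "density": 0.25},
    {"title": "Hypothetical Work C", "subject": "philosophy",
     "year": 1920, "words": 20000, "density": 0.30},
    {"title": "Hypothetical Work D", "subject": "history",
     "year": 1700, "words": 25000, "density": 0.28},
]

# "Books on philosophy that are short, dense, and written before 1800."
matches = facet(catalog, subject="philosophy", max_words=50000,
                min_density=0.2, before_year=1800)
titles = [r["title"] for r in matches]
print(titles)
```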

  5. Harvesting - Libraries are about collections AND services. You need both to 
call yourself a library. [*] Libraries could enhance their collections as well as 
their services if they were to acquire -- not license -- digital materials. This 
can easily be done through harvesting. Here's one way: 1) dump all your MARC 
records to a file, 2) extract the authors, titles, and other identifying pieces 
of information, 3) programmatically search for these items in open repositories, 
4) once found, mirror them to a local HTTP server, 5) update the MARC records 
with the local URL as well as the remote URL, and 6) provide cool services 
against the electronic content that are not possible with analog content, such as 
the things outlined in Items #3 and #4, above. Repeat this process for other 
types of content -- the content fitting your local collection policy.
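
The harvesting steps above can be sketched roughly as follows. This is an 
outline under several assumptions, not a working harvester. The records are 
plain dictionaries standing in for fields already extracted from MARC. The 
repository lookup is a caller-supplied function (here a fake one). The host 
names are placeholders. Nothing is actually downloaded. A real workflow would 
also write the URLs back into the MARC records, presumably in 856 fields.

```python
from urllib.parse import quote

# Hypothetical base URL of the local HTTP server holding mirrored files.
LOCAL_BASE = "http://library.example.edu/mirror/"


def make_query(record):
    """Step 2: combine identifying fields into a search string."""
    parts = (record.get("author"), record.get("title"))
    return " ".join(part for part in parts if part)


def local_url(record_id, remote_url):
    """Step 4: decide where the mirrored copy would live locally."""
    filename = remote_url.rstrip("/").rsplit("/", 1)[-1]
    return LOCAL_BASE + quote(f"{record_id}/{filename}")


def harvest(records, find_remote):
    """Steps 3-5: look up each record and attach local and remote URLs.

    find_remote is a hypothetical function taking a query string and
    returning a URL in an open repository, or None if nothing is found.
    """
    updated = []
    for record in records:
        remote = find_remote(make_query(record))
        if remote is None:
            updated.append(record)  # nothing found; leave record unchanged
            continue
        urls = [local_url(record["id"], remote), remote]
        updated.append({**record, "urls": urls})
    return updated


def fake_repository(query):
    """Stand-in for a real repository search; returns canned answers."""
    known = {
        "Machiavelli The Prince":
            "http://repository.example.org/texts/prince.txt",
    }
    return known.get(query)


records = [
    {"id": "rec1", "author": "Machiavelli", "title": "The Prince"},
    {"id": "rec2", "author": "Unknown", "title": "Not Online"},
]
harvested = harvest(records, fake_repository)
for record in harvested:
    print(record["id"], record.get("urls", "no match"))
```

Passing the lookup in as a function keeps the sketch independent of any 
particular repository, so the same loop could be pointed at different sources.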

We librarians are exploiting neither the technology nor the environment. We are 
reactive instead of proactive. We automate processes instead of figuring out how 
to truly take advantage of our tools' capabilities. I think we are barking up the 
wrong trees trying to solve problems most people think are increasingly 
irrelevant. I sort of feel like we, the library profession, are trying to 
improve the telegraph when everybody else has moved on to cell phones.

[*] A Buddhist monk once said, "Collections without services are useless, and 
services without collections are empty."

-- 
Eric Lease Morgan
University of Notre Dame