> I think library "discovery systems" and/or catalogs need to do more, and here's why...
I appreciate all the comments. Thank you. The posting was designed to generate discussion, and here are some of my replies:
1. Catalogs -- Yes, traditionally, library catalogs have been about the physical collections, but that was only true because collections were physical to begin with. Instead, I consider library catalogs -- and I mean that word exactly -- to be lists of resources pertinent to the information needs of our constituents. They represent -- and I mean that word too -- library collections. But I don't think library catalogs necessarily need to describe things physical. They can also describe things electronic as well as things outside our walls.
2. Databases versus indexes - Electronic card catalogs are essentially database applications. Well-designed ones are built on relational databases complete with normalization. Databases are very good at creating and maintaining information, but they s\/ck when it comes to search. This is because a person must know the structure of the database in order to completely exploit it. Indexers, on the other hand, s\/ck when it comes to creating and maintaining information, but they are great when it comes to search. No need to know syntax, and the power of statistical prediction comes into play. "Discovery systems" are indexers, and believe it or not, the majority of them are based on the same open source technology (Lucene/Solr).
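To make the contrast concrete, here is a minimal sketch of what an indexer does, assuming a toy inverted index with a simple TF-IDF style score. It is not how Lucene/Solr is implemented internally, and the function names are only illustrative. The point is that the searcher types free text and never needs to know any field or table structure.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, total_docs, query):
    """Return doc ids ranked by a simple TF-IDF style score.

    Terms that appear in few documents get a higher weight, which is
    the "statistical prediction" part: rare query terms matter more.
    """
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + total_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)
```

Usage would look like `search(build_index(docs), len(docs), "machiavelli prince")`, where `docs` maps identifiers to free text.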
3. Find is not the problem to solve - Well-implemented indexers work very well satisfying the majority of people's information seeking needs. Everything else is time spent after reaching the point of diminishing returns. That does not mean there is not a whole lot of relevant information for people to sift through. What is needed, IMHO, are more tools enabling people to evaluate, use, and understand the information they find. Find is only the first and not the most important part of their process. After people find they want to do a myriad of other things: retrieve, read, compare & contrast, annotate, analyze, learn, verify, rate, rank, review, share, summarize, etc. There is no logical reason why librarianship can not evolve towards these sorts of services. There may be financial and political reasons but not logical ones.
4. Functionality - The functionality of digital content complements the functionality of analog content. We need to take advantage of this. Because digital content is easily "read" by computers, it lends itself to a host of different ways to interpret it. For example, take size. Telling me a book is 108 pages long tells me very little, but telling me a book is 249,987 words long tells me something less ambiguous. (James Joyce's Ulysses is about 280,000 words long whereas Machiavelli's The Prince is roughly nine times shorter at 30,731 words long.) Here's another example. A statistical measure called "vocabulary density" is the ratio of unique words in a text compared to the total number of words in a text. In general, texts with higher densities are more difficult to read. Many other types of statistical analyses can be done if and only if the content exists in digital form. If these sorts of numeric characteristics were included in the descriptions of our collections, then we could provide faceted browsing against them. People could then compare & contrast works quantitatively as well as qualitatively. "I want to find books on philosophy that are short but have a high vocabulary density and written before 1800." Library catalogs and "discovery systems" do not provide this sort of functionality.
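Here is a rough sketch of how such measures might be computed and then used as facets. It assumes plain-text input, a naive tokenizer, and records held as simple dictionaries. The field names (subject, year, word_count, vocabulary_density) are my own invention for illustration and not any catalog's schema.

```python
import re


def tokenize(text):
    """Naive tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_density(text):
    """Ratio of unique words to total words, 0.0 for empty text."""
    words = tokenize(text)
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def describe(text):
    """Numeric characteristics that could be added to a description."""
    return {
        "word_count": len(tokenize(text)),
        "vocabulary_density": vocabulary_density(text),
    }


def facet_filter(records, subject, max_words, min_density, before_year):
    """Keep records that satisfy every facet, e.g. philosophy,
    short, dense vocabulary, and written before a given year."""
    return [
        r for r in records
        if r["subject"] == subject
        and r["word_count"] <= max_words
        and r["vocabulary_density"] >= min_density
        and r["year"] < before_year
    ]
```

A real service would want a better tokenizer, and density ought to be computed over samples of equal size since it naturally falls as texts get longer, but the idea is the same.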
5. Harvesting - Libraries are about collections AND services. You need both to call yourself a library. [*] Libraries can enhance their collections as well as their services if they were to acquire -- not license -- digital materials. This can easily be done through harvesting. Here's one way: 1) dump all your MARC records to a file, 2) extract the authors, titles, and other identifying pieces of information, 3) programmatically search for these items in open repositories, 4) once found, mirror them to a local HTTP server, 5) update the MARC records with the local URL as well as the remote URL, and 6) provide cool services against the electronic content that are not possible with analog content such as the things outlined in Items #3 and #4, above. Repeat this process for other types of content -- the content fitting your local collection policy.
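Steps 3 through 5 above might be sketched as follows. This is only an outline under several assumptions: the records have already been extracted from MARC into dictionaries with "author" and "title" keys, the caller supplies a search function (which repository to query is a local decision) and a fetch function, and mirror_dir and base_url are hypothetical names for a directory served by the local HTTP server and its public address. In a real MARC record the URLs would normally go into 856 fields.

```python
import os
import re


def slugify(text):
    """Turn an author/title string into a safe file name stem."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def harvest(records, search, fetch, mirror_dir, base_url):
    """Mirror found items locally and record both URLs.

    records    -- list of dicts with "author" and "title" (assumed shape)
    search     -- callable(author, title) -> remote URL or None (supplied)
    fetch      -- callable(url) -> bytes (supplied, e.g. an HTTP GET)
    mirror_dir -- local directory served by the HTTP server (assumed)
    base_url   -- public URL prefix for mirror_dir (assumed)
    """
    os.makedirs(mirror_dir, exist_ok=True)
    for record in records:
        remote_url = search(record["author"], record["title"])
        if remote_url is None:
            continue  # not found in the repository; leave record as-is
        filename = slugify(record["author"] + " " + record["title"]) + ".txt"
        with open(os.path.join(mirror_dir, filename), "wb") as handle:
            handle.write(fetch(remote_url))
        # Keep the local URL as well as the remote URL (step 5).
        record.setdefault("urls", []).extend([base_url + filename, remote_url])
    return records
```

Passing search and fetch in as parameters keeps the sketch independent of any particular repository or HTTP library.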
We librarians are not exploiting the technology nor the environment. We are reactive instead of proactive. We automate processes instead of figuring out how to truly take advantage of our tools' capabilities. I think we are barking up the wrong trees trying to solve problems most people think are increasingly irrelevant. I sort of feel like we, the library profession, are trying to improve the telegraph when everybody else has moved on to cell phones.
[*] A Buddhist monk once said, "Collections without services are useless, and services without collections are empty."
--
Eric Lease Morgan
University of Notre Dame
Received on Tue Aug 31 2010 - 17:11:50 EDT