I still believe our library catalogs and “discovery” systems do not do everything they could. Specifically, I believe they can include more quantitative data and information.
With the advent of digitized materials (like the things found in the HathiTrust, institutional repositories, journal article indexes/databases, and “digital libraries”), it is possible to count and measure characteristics of individual items and then save those measurements in the surrogate index record. These measurements include, but are not limited to:
* length of document in words
* 100 most frequently used words or ngrams (excluding stop words)
* 100 most frequently used parts-of-speech
* list of unique or infrequently used words
* 25 most statistically significant words or phrases
* a list of the frequent or statistically significant named entities
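
As a rough illustration of the first few measurements, here is a minimal Python sketch using only the standard library. The function name, the field names, and the tiny stop-word list are my own assumptions for the example; a real system would use a proper tokenizer and a much longer stop-word list.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list (an assumption for illustration only).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def measure(text, top_n=100):
    """Return a dict of simple quantitative features for one document."""
    # Lowercase the text and pull out runs of letters as words.
    words = re.findall(r"[a-z]+", text.lower())
    # Drop stop words before counting frequencies.
    content = [w for w in words if w not in STOP_WORDS]
    counts = Counter(content)
    return {
        "length_in_words": len(words),
        "frequent_words": [w for w, _ in counts.most_common(top_n)],
        # Words used exactly once in this document.
        "unique_words": sorted(w for w, c in counts.items() if c == 1),
    }

features = measure("The cat and the dog. The cat sat.", top_n=2)
```

The resulting dictionary is the sort of thing that could be stored alongside the title, author, and subject fields of a surrogate record.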
Other measurements could be taken as well, such as the likelihood the material was written by a man or a woman, or the likelihood the document corresponds to a particular genre. The reading level of the document could be calculated and scaled against education levels. Specialized coefficients, such as a “great books” coefficient, can be modeled and then applied to each item to denote how it discusses the “great ideas”. [1]
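
To make the reading-level idea concrete, here is a hedged sketch using the Flesch-Kincaid grade level formula, which is one common choice and not necessarily the best one for a catalog. The syllable counter is a crude vowel-group heuristic I am assuming for brevity, so the scores are only approximate.

```python
import re

def count_syllables(word):
    """Approximate syllables as runs of vowels (a rough heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """Estimate the U.S. school grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    # Flesch-Kincaid grade level formula.
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

easy = flesch_kincaid_grade("The cat sat. The dog ran.")
hard = flesch_kincaid_grade(
    "Philosophical investigations necessitate considerable intellectual perseverance."
)
```

Here `easy` comes out well below `hard`, and either number could be saved in the record and mapped to an education level.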
Given these sorts of things in the surrogate index records, it would be possible for our catalogs and discovery systems to answer questions such as:
* find me a short, easy-to-read philosophy book
* find me a thorough, college-level biology book
* find me a book that takes place in and around Paris and from a woman’s point of view
* given this set of previously marked items, create a graph illustrating their use of pronouns
* given this set of previously marked items, create a timeline illustrating what takes place when
* given this set of previously marked items, create a world map illustrating what takes place where
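
The first two questions could be answered by simple filters over the numeric fields. The sketch below assumes hypothetical field names (`length_in_words`, `grade_level`) and invented sample records; the thresholds for "short" and "college-level" are arbitrary choices for the example.

```python
# Hypothetical surrogate records; every title and value is invented.
RECORDS = [
    {"title": "Short Philosophy Primer", "subject": "philosophy",
     "length_in_words": 30000, "grade_level": 7.0},
    {"title": "Long Philosophy Treatise", "subject": "philosophy",
     "length_in_words": 250000, "grade_level": 15.0},
    {"title": "Thorough Biology Text", "subject": "biology",
     "length_in_words": 300000, "grade_level": 13.5},
]

def search(records, subject, min_words=None, max_words=None,
           min_grade=None, max_grade=None):
    """Return titles of records matching a subject and optional numeric limits."""
    hits = []
    for r in records:
        if r["subject"] != subject:
            continue
        if min_words is not None and r["length_in_words"] < min_words:
            continue
        if max_words is not None and r["length_in_words"] > max_words:
            continue
        if min_grade is not None and r["grade_level"] < min_grade:
            continue
        if max_grade is not None and r["grade_level"] > max_grade:
            continue
        hits.append(r["title"])
    return hits

# "find me a short, easy-to-read philosophy book"
short_easy_philosophy = search(RECORDS, "philosophy",
                               max_words=50000, max_grade=9.0)
# "find me a thorough, college-level biology book"
thorough_college_biology = search(RECORDS, "biology",
                                  min_words=200000, min_grade=13.0)
```

In a real discovery system these limits would more likely be range queries against the index than a loop in application code.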
With the advent of full text, our systems can go beyond find & discover and move towards use & understanding.
[1] “great books” - http://bit.ly/1AC2aFd
—
Eric Lease Morgan
Received on Mon Jan 19 2015 - 11:41:57 EST