Re: opac live search

From: Weinheimer Jim <j.weinheimer_at_nyob>
Date: Wed, 4 Mar 2009 11:30:46 +0100
To: NGC4LIB_at_LISTSERV.ND.EDU

Alexander Johannesen wrote:

> To me, there is nothing wrong with selection, as long as I have the
> option to unselect. Just like Tim's argument, we need all data and
> then have tools that help us sift through it. That tool in the past
> *were* the librarian, but the amounts of information makes that
> approach unmaintainable and undoable. Even small constrained domains
> are overflowing with information these days, and who are the librarian
> to decide what is the right kind of information for others?

This misses the point of library selection: we are not there to decide what is "right." Rather, using tools developed in close contact with the administration and faculty (and, we hope, users), we decide which materials would be most useful. Different collections with different purposes will select different things. And there are some collections termed "exhaustive" that try to get more or less "everything" in a certain area, more or less without selection (but note that even this word is in quotation marks).

Using an exhaustive collection demands a lot from a user: the collection is incredibly complex, and there is so much junk to sift through. Again, one man's junk is another man's treasure, but it still must be organized in a coherent way to save people's time; otherwise nobody could ever do anything with it.

Librarians select materials on the basis of their utility to the users of the collection and of local resources such as space and funding. It is unethical for a librarian to create a collection that simply mirrors his or her own personal prejudices and biases. This is quite a different task from that of the teaching faculty. If I don't have something in my collection and someone asks for it, I must help that person get the information, whether or not I agree with what they are doing.

Google attempts to be exhaustive in its own domain (no selection), although there is still the "hidden web," so lots of materials are missing. Its method of "organization" (a type of selection by ordering the links) is completely secret. This arrangement of links can be, and has been, manipulated by all kinds of people for various purposes, and the user cannot change it except by coming up with additional search words, limiting to a specific domain or format, and so on.

I am not maintaining that Google is bad--it is what it is--but Google has very serious weaknesses that must be acknowledged. This is the only way forward. While these limitations are difficult to see in regular Google for various reasons, they become crystal clear in Google Book Search.

> > How can this "control zone" be created? I think that
> >  (apologies to Tim) the Semantic Web has the tools
> > and methods to create it.
> 
> Both yes and no. There's tools out there and the data could be made
> available, but the lack of consistency and the ambiguous levels of
> ontologies on the web makes this approach currently very, very hard.
> RDF, RDFs and OWL are severely lacking in good identity control,
> something rather basic to anything academic, so consider me sceptical,
> at least at this point.
> 
> Apart from that, I find Tim's response fascinating and rather smack
> bang on. The thread we're talking about is the OPAC, and in this
> context the OPAC as a tool for scholars, so finding an OPAC that's
> actually suited to the *scholars* needs (and not the librarian) is
> sought. Any takers?

Although I may be wrong, my reading of Tim's comments is that Google is so good and the OPAC so bad that the OPAC is unimportant for scholarly research. I pointed out some really glaring inadequacies in Google Book Search where their algorithm simply disintegrated, and I haven't mentioned the arrangement of the resources after a search, which is completely incoherent to me (why is this book number 1?). And this is when I am doing my own research (which I do, too). Consequently, I don't believe Google has organized GBS, since the results are completely disorganized.

I don't say that the results are useless or bad, but they are clearly not organized. It is only natural to wonder whether the same limitations exist in the regular Google search as well, where they are more difficult to see.

This is why I maintain that we need both methods, although as you point out, it could easily work in one system so that people could search all at once.

Jim Weinheimer
Received on Wed Mar 04 2009 - 05:36:54 EST