Re: What's Better: Dumbed Down or Loaded with Functionality?

From: Bernhard Eversberg <ev_at_nyob>
Date: Mon, 12 Jun 2006 15:43:19 +0200
To: NGC4LIB_at_listserv.nd.edu

Ross Singer wrote:
> I think there's a fallacy of a zero-sum game here.  "If we don't get
> absolutely right the first time, we've failed completely".

Did anybody say that? I certainly didn't think it; I've been a catalog
database programmer and designer myself for long enough.

Nobody, for one thing, could say what "absolutely right" would mean.
But one thing we *can* say for sure is that no catalog, however
technologically advanced, could produce a *meaningful* response
for every query thrown at it - because software has no access to
meaning, and I happen to think that many people are going to have
to learn that this is so, and what it entails. Just look at server logs
and see what users type in. What could even an experienced reference
librarian with all her natural intelligence make of some of those
inputs? And that holds not just for catalogs.

What we have, in legacy data, is *some* controlled vocabulary and *some*
transcribed data, and not a lot of either, and there's not much hope of
improving this for the majority of the stuff we have.

The controlled vocabulary serves to make the catalog reliable for
*a few* types of queries, to facilitate collocation and a useful
arrangement of results. All other queries, and those using the
transcribed data in particular, just *cannot* be as reliable as
those for the authority data. The more text matter there is per
document, of course, the more you can do by way of fuzzy
matching, word stemming, decomposition, lexical analysis,
dictionary lookups and what have you, to produce helpful
results for a good many queries, but not for all. But if the methods
used differ from one catalog to the next, think of what it means
for interoperability. Yet, things like these will have to be
implemented or there'll be not much chance of getting beyond
what we have now.
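
To make the contrast concrete, here is a minimal sketch in Python. It is
an illustration only, not how any particular catalog works: the
authority file, the titles, and the function names are all made up, and
the fuzzy part simply uses the standard library's difflib as a stand-in
for whatever matching method a real system might choose.

```python
import difflib

# Hypothetical authority file: variant name forms map to one controlled heading.
AUTHORITY = {
    "twain, mark": "Twain, Mark, 1835-1910",
    "clemens, samuel langhorne": "Twain, Mark, 1835-1910",
}

# Hypothetical transcribed titles, keyed by record id.
TITLES = {
    1: "The adventures of Huckleberry Finn",
    2: "Adventures of Tom Sawyer",
    3: "Life on the Mississippi",
}


def authority_lookup(query):
    """Exact lookup: returns the controlled heading, or None if unknown."""
    return AUTHORITY.get(query.strip().lower())


def fuzzy_title_search(query, cutoff=0.6):
    """Approximate match on transcribed titles.

    What comes back depends on the similarity measure and on the cutoff,
    both of which are arbitrary choices made here for illustration.
    """
    by_title = {title.lower(): rid for rid, title in TITLES.items()}
    hits = difflib.get_close_matches(
        query.lower(), list(by_title), n=3, cutoff=cutoff
    )
    return [by_title[h] for h in hits]


if __name__ == "__main__":
    print(authority_lookup("Clemens, Samuel Langhorne"))
    print(fuzzy_title_search("huck finn", cutoff=0.6))
    print(fuzzy_title_search("huck finn", cutoff=0.4))
```

The authority lookup either finds the heading or it doesn't, whatever
form of the name was used. The title search for "huck finn" finds
nothing at one cutoff and the right record at another, so two catalogs
that tune this differently will answer the same query differently.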

One question resulting from this is, of course, should authority
control as we know it, labor-intensive as it is, be abandoned in favor
of increasing the input of additional, "enriching" text matter? IOW,
are those reliable queries still needed? Cutter has been dead for over
a hundred years, after all, and LCSH, LCC and Dewey all do appear to be
aging...
In yet other words: do we still need reliable queries, and for what
types of criteria - or not? [What are the types of queries that
yield reliable results in Google? Is there demand for more?]

Regards,
B. Eversberg