Re: Relevance ranking: was Aqua Brow

From: Tim Spalding <tim_at_nyob>
Date: Sat, 5 Jan 2008 17:01:45 -0500
To: NGC4LIB_at_listserv.nd.edu

We end at the same place, but I'm not sure I agree with Casey's denial
of the existence of "concepts."

"It's all just strings" may be true from the programming standpoint,
but I think we can talk about concepts and classification as something
potentially more than string matching. That is, when I refer to Casey
as a "human" and Lassie as a "dog" I am doing something more than
setting up strings and pointing them around. You don't need to be a
Platonist or an Aristotelian to believe in the possibility of meaning,
either abstractly or with reference to individual interpreters.
Meaning can be at least partially communicated to paper or to
databases, and so it can be searched.

Indeed, if we take meaning seriously we can see the flaws in something
like LCSH all the more clearly. Whatever its intent, LCSH, at least
as it's deployed today, doesn't connect books to meaning very well at
all.

Some examples:

*Are books "about" 3-6 things? Of course not. They have 3-6 LCSHs
because that's how many fit on an index card. In reality, a book is
about a lot of things.

*Are books "about" everything to the same degree? Of course not. Some
books are really and truly about WWI, and some are about it to some
lesser extent or with some qualification or debate. The world of
meaning is nonsense without degree, debate, audience and nuance. Only
in LCSH is meaning so cut-and-dried.

*In a similar vein, are there really just 1,253 things in the Library
of Congress about "Man-Woman Relationships"? I would think 95% of
western literature is about that topic, just not in the same way or
degree.

*Does the aboutness of a book stay fixed from the day it's cataloged?
Is Bridget Jones's Diary not "chick lit" because it was published
before the term existed, and a decade before LCSH adopted it? Are
early editions of Anne Frank's diary not about the Holocaust? Of
course not. These are facts about ink on paper cards, not about
meaning.

*Are librarians the only arbiters of meanings? Obviously not. In a
physical library it makes sense to control things; it would be anarchy
if every professor and student set about penciling in corrections and
adding cards. In the digital world, non-librarians can be part of the
process too--adding tags and other systems--and if people want to
ignore what they have to say, that's fine too.
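
To make the point about degree concrete, here is a minimal sketch of
"aboutness" as a weight rather than a yes/no heading. Everything in it
is an assumption for illustration: the titles, the tag counts, and the
scoring rule (a tag's share of all tag applications on a book) are
invented, and it is not a description of how LibraryThing or any
catalog actually ranks things.

```python
from collections import Counter

# Hypothetical tag counts per book: how many readers applied each tag.
# The titles and numbers are invented for illustration only.
TAG_COUNTS = {
    "Book A": Counter({"wwi": 40, "history": 30, "poetry": 5}),
    "Book B": Counter({"wwi": 3, "romance": 50, "fiction": 47}),
    "Book C": Counter({"gardening": 20}),
}


def aboutness(counts, topic):
    """Share of a book's tag applications that name `topic` (0.0 to 1.0)."""
    total = sum(counts.values())
    return counts[topic] / total if total else 0.0


def rank_by_topic(tag_counts, topic):
    """Return (title, score) pairs for books with a nonzero score, best first."""
    scored = [(title, aboutness(c, topic)) for title, c in tag_counts.items()]
    return sorted(
        [(t, s) for t, s in scored if s > 0],
        key=lambda pair: pair[1],
        reverse=True,
    )


if __name__ == "__main__":
    for title, score in rank_by_topic(TAG_COUNTS, "wwi"):
        print(f"{title}: {score:.2f}")
```

Under these made-up numbers, both "Book A" and "Book B" are about WWI,
but to very different degrees, and "Book C" drops out entirely. A flat
subject heading would either list A and B side by side or leave B out.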

I don't want to give up on meaning. Anyone who thinks that LCSH is
usually the best way for patrons to find things ought to receive a
classification from the DSM-V. (And for those who don't believe in the
idea of relevancy, my next email will be about the sex life of fish,
in German.)

But LCSH is an okay way. It belongs in the toolkit. It approaches the
issue from a different angle, and, with all its flaws, often produces
results that keyword searching (or tags) cannot. I wish someone would
make it better.

But meaning is *hard*. I've never heard anyone so much as whisper
about making changes to LCSH to remove the constraints its physical
origins have given it. Ditto every other "meaning" system. I have no
hope for the "Semantic Web," for much the same reasons. Meanwhile,
breakthroughs are coming elsewhere--in full text, in tags, in lists
assembled outside of libraries.

On all other points, I agree with Casey. Except I'm not quite as
depressed. Whether tags mean anything, LibraryThing is about to hit 30
million tags, and my new tag recommendation algorithm found me a slew
of books on cataloging and classification I haven't read yet.

Best,
Tim
Received on Sat Jan 05 2008 - 17:02:35 EST