Re: Book tagging: Amazon and LibraryThing

From: Binkley, Peter <Peter.Binkley_at_nyob>
Date: Mon, 26 Feb 2007 10:00:23 -0700
To: NGC4LIB_at_listserv.nd.edu

The comments (mine included) on William Denton's fascinating recent
posting on FRBR and Copernicus
(http://www.frbr.org/2007/02/22/de-revolutionibus#comment-83380) touch
on this.  There is no clear boundary between bibliography as the
workaday business of managers of collections and bibliography as a
scholarly discipline. A post-FRBR world might look something like what
William describes in a follow-up posting, discussing Gingerich's printed
Census of Copernicus copies
(http://www.frbr.org/2007/02/26/de-revolutionibus-redux):

"Right now it's all in Gingerich's Census, which I say again is an
astounding piece of scholarship, but is trapped on paper. If its
contents were also available in RDF, for example, related to controlled
vocabularies or ontologies about people and publishing and astronomy and
countries and provenance and the history of science ... Well, imagine
the possibilities."

Until recently the bibliographic work of cataloguers has been trapped in
the catalogue, but we can start to see ways of opening it up to
extension to address some of the issues Tim raises.

Peter

-----Original Message-----
From: Next generation catalogs for libraries
[mailto:NGC4LIB_at_listserv.nd.edu] On Behalf Of Tim Spalding
Sent: Monday, February 26, 2007 8:52 AM
To: NGC4LIB_at_listserv.nd.edu
Subject: Re: [NGC4LIB] Book tagging: Amazon and LibraryThing

So, this is a tempest in a tea-pot, but let me explain my point before
you all think I'm mad.

First, FRBR is a very useful model. Some abstractions are useful, and
FRBR, along with other "work" models such as LibraryThing's, is one of
them.

My point is this. FRBR is a binary model, defining largely binary,
clear-cut relationships between things. In the real world, relationships
are often more complex. And the relationships we want to see are as much
about what we're looking for as about what "really" is. This becomes
much clearer when you think of book data as larger than librarianship. A
collector, for example, will organize works and manifestations very
differently from a librarian.

David Weinberger gave an excellent illustration of this phenomenon in
"Miscellaneous Hamlet" on his blog (
http://www.hyperorg.com/blogger/mtarchive/miscellaneous_hamlet.html ).
As he notes, there is no "real" Hamlet. Every edition is a compromise
between three imperfect sources. Every edition is a conversation
between the editor and the sources. Much the same is true of most of
Greek and Latin literature. How many texts are in a text, how texts
relate to one another, who the authors of a text are: these are all
essentially *conversations*. When you scratch the surface, the easy
verities of "work" vanish.

Another good illustration is what happens on LibraryThing. People argue
about what works should be combined all the time. In part, that is
because LibraryThing's model is simpler than FRBR's. But it also happens
because there is no final answer. The way we lump and split is as much
about us and what we're trying to do as it is about the world. (Note:
LibraryThing still has a binary model of relationships; I don't have the
answer!)

FRBR is a great model. If every library catalog were FRBRized, the world
would be a better place. It makes a lot of sense. But the model is too
often taken as the goal. FRBR is an advance, but there may well be
better ways, more wholeheartedly digital ways, of allowing users to
navigate the relationships that matter to them.

The situation is not unlike subject headings, another binary system of
clear-cut relationships rooted in the physical world. Subject systems
like LCSH are a big improvement over the alternatives: big piles of
books or a single-topic shelving system. But a system of buckets has
inherent limitations. In the real world, "aboutness" is at least a
percentage, not a boolean. There are no perfect answers, but again and
again, systems that embrace this complexity (e.g., Google) are producing
better results than ones that don't (the Yahoo directory).
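
To make the "percentage, not a boolean" point concrete, here is a
minimal Python sketch. The titles, subjects, and weights are invented
for illustration; they are not drawn from LCSH or from any real catalog,
and the function names are hypothetical.

```python
# Hypothetical data: each book carries a weight per subject (0.0 to 1.0)
# instead of a yes/no subject heading. All values are made up.
ABOUTNESS = {
    "Book A": {"astronomy": 0.9, "history of science": 0.4},
    "Book B": {"astronomy": 0.2, "history of science": 0.8},
    "Book C": {"astronomy": 0.05},
}


def bucket(subject, cutoff=0.5):
    """Boolean view: a book is either in the subject bucket or it is not."""
    return sorted(
        title
        for title, weights in ABOUTNESS.items()
        if weights.get(subject, 0.0) >= cutoff
    )


def ranked(subject):
    """Graded view: every book with any weight on the subject, strongest first."""
    hits = [
        (title, weights[subject])
        for title, weights in ABOUTNESS.items()
        if subject in weights
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

With this toy data, bucket("astronomy") keeps only "Book A", while
ranked("astronomy") keeps all three books and orders them by weight. The
bucket throws away exactly the information the ranking uses.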

I'm not sure what the answer is, but someone needs to try a "fuzzy
FRBR." Even if the set remains the same, there are weaker and stronger
relationships within it. Something like LibraryThing's thingISBN or
OCLC's xISBN service is already inherently statistical: complex,
forgiving pattern matching on huge sets of data. As David Weinberger
writes, xISBN and thingISBN have an "acknowledged degree of fuzziness."
But we lie. We hide the fuzziness. We turn our well-informed guesses
into true or false statements. That's useful in some contexts, but it's
not the whole story.
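
One way to picture a "fuzzy FRBR" is to keep a strength on each
relationship and derive the true/false answer only when a context asks
for it. The Python below is a rough sketch under that assumption. The
identifiers, strengths, and function names are all hypothetical, and
nothing here reflects how thingISBN or xISBN actually store or expose
their data.

```python
# Hypothetical identifiers and relationship strengths (0.0 to 1.0).
# These are invented values, not output from thingISBN or xISBN.
RELATED = {
    ("isbn-1", "isbn-2"): 0.95,  # e.g. same text, different binding
    ("isbn-1", "isbn-3"): 0.60,  # e.g. an abridged version
    ("isbn-1", "isbn-4"): 0.30,  # e.g. a loose adaptation
}


def strength(a, b):
    """Look up the strength of a relationship in either direction."""
    return RELATED.get((a, b), RELATED.get((b, a), 0.0))


def same_work(a, b, threshold=0.5):
    """Binary view: collapse the strength into a true/false statement."""
    return strength(a, b) >= threshold


def related_to(isbn, minimum=0.0):
    """Fuzzy view: everything related to isbn, strongest first."""
    found = []
    for (a, b), s in RELATED.items():
        if isbn in (a, b) and s >= minimum:
            found.append((b if a == isbn else a, s))
    return sorted(found, key=lambda pair: pair[1], reverse=True)
```

A librarian and a collector could then call same_work with different
thresholds, or skip the binary answer and browse related_to directly.
The same data supports both, and the guess stays visible as a guess.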

Finally, when you speak of tags on items and tags on works, the idea
drifts away from what makes tagging good. Users tag because it's quick,
almost thoughtless. The way you see it is the way you tag it. For the
idea to be useful, it would need to be implemented. That means telling
users to tag some things in one box and other things in another. Users
don't want to be information professionals.

Tim