Re: OCLC Formally Withdraws WorldCat Policy

From: Weinheimer Jim <j.weinheimer_at_nyob>
Date: Fri, 10 Jul 2009 15:27:11 +0200
To: NGC4LIB_at_LISTSERV.ND.EDU

Tim Spalding wrote:
> "Compare for example a search in Google Squared using "Books on
> Fish"
> (http://www.google.com/squared/search?q=books+on+fish) with a
> catalogue search for Dewey = 597.2 It is like the difference between
> today's web and the Semantic web."
> 
> *Google is:
> 
> "Use algorithms, including the full text of books, to guess which
> books are about fish."
> 
> *Dewey is:
> 
> "Give me books that catalogers who usually haven't read the book
> *think* are about fish, minus categories we arbitrarily decide aren't
> about fish, like fishing fish, fish culture, fish as pets, cooking
> fish, fish ponds, anything for younger people or fiction, and assuming
> a nonsensical model of the world in which books can only be "about"
> one thing, so if it's a book of maps of fish, fish oils or fish
> genomes well, hey, it might be anywhere but, while you'll have to
> guess, we're not guessing since we have a system!"
> 
> This isn't the difference between the web and the semantic web. It's
> the difference between a spaceship and a Stanley Steamer. Except that
> DDC was already out of date when the Stanley Steamer was created.

Catalogers may not read the books they catalog, but neither do the computers, which can't "read" but only ingest text. So in this sense, Google is not:
> "Use algorithms, including the full text of books, to guess which books are about fish."

but
"Use algorithms written by nerds who probably haven't seen the sun in 5 years and spend all their spare time killing Orcs in World of Warcraft or doing weird stuff in Second Life, to try to figure out if certain strings of text may be about fish, but you may just as well be looking at items by Country Joe and the Fish or from the National Enquirer about babies born with fish heads."

Professional catalogers, at least, have been trained in subject analysis, which is not a simple task. Naturally it can be done better, but authors get it wrong all the time when adding subjects to their own books and articles. (Strangely enough, in my own experience, the subjects they choose are normally too broad, not too specific.)

And the students I work with, at least, mostly find Google-type results too incoherent when they are doing something serious for a class. Certainly, they can find some excellent materials, but it's primarily serendipitous. Again, I hope the power of the two methods can be merged, but my concern is this: one is expensive (i.e., it demands human labor, which is strictly verboten for many today), while the other is "cheap" (even though the computers may cost millions more than the humans, and produce bizarre results).

As John Lennon put it: "Strange days indeed."

Jim Weinheimer