Re: "Third Order"--was Libraries & the Web

From: Corey A Harper <corey.harper_at_nyob>
Date: Wed, 16 May 2007 22:52:45 -0400
To: NGC4LIB_at_listserv.nd.edu

Absolutely, Casey.

My OCLC reference was only an example.  Probably a poorly chosen one, at
that.  I agree that the idea is to decentralize this stuff.  Part of the
point of RDF is that it provides a common model for disparate,
distributed data systems, while providing some of the precise tools
we've come to expect of homogeneous XML and database environments.
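
To make the "common model" point a little more concrete, here is a rough
sketch, not a real RDF toolkit. It treats statements as plain
(subject, predicate, object) tuples and shows that data from two
independent sources can be pooled and queried without agreeing on a
schema first. Every identifier and value in it (book:1, ex:tag, the
match helper, the sample data) is hypothetical and chosen only for
illustration.

```python
# Illustrative only: RDF-style statements modeled as plain
# (subject, predicate, object) tuples. All names and data are hypothetical.

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern, sorted; None is a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Two independent sources describing the same resource.
catalog = {
    ("book:1", "dc:title", "An Example Title"),
    ("book:1", "dc:creator", "An Example Author"),
}
tagging_site = {
    ("book:1", "dc:title", "An Example Title"),
    ("book:1", "ex:tag", "classification"),
}

# Merging is a set union: no shared schema is needed up front, and
# identical statements from different sources collapse into one.
pool = catalog | tagging_site

# Ask for everything anyone has said about book:1.
for triple in match(pool, s="book:1"):
    print(triple)
```

A real system would use full URIs and an RDF library with a SPARQL
engine instead of tuples and a helper function, but the shape of the
operation (union the statements, then query by pattern) is the same.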

I think I agree that trying to demand strict compliance to a particular
data model or markup system is counterproductive.  I think the real
value lies in mapping or binding existing data, which, while loose, is
at least consistently formatted, to models that are starting to get
critical mass.  Unfortunately, that also takes huge amounts of work.
That's why I see so much promise and significance in the RDA / DCMI /
IEEE-LOM Data Model agreement Karen Coyle posted here and blogged about.

An RDF vocabulary for RDA terms, combined with other work that's
happening on that front, such as the FRBR core concepts expressed as RDF
and Simile's RDF ontology for MODS, moves us closer to having a formal
model for the data in all of our legacy MARC records and our controlled
vocabularies.  I think part of the unnecessarily negative reaction some
have to the idea is rooted in the misconception that it further
constrains, simplifies, or otherwise breaks what's valuable about
current cataloging practice.  That's simply not true.  If anything, it
enriches that already strong tradition.

-Corey

Casey Bisson wrote:
> Don't take the following as suggesting I'm against such an idea, just
> a few thoughts about how I'd like it to work.
>
> Most of our systems require significant pre-coordination and absolute
> relationships, but the web (and much of its success) stands in
> contrast to that.
>
> Google could have been built by requiring all websites to register
> their content and report their links, but it wasn't. And I think we'd
> all agree that it wouldn't work as well if it was.
>
> The library world is smaller, so it's somewhat less fantastic to
> expect that type of relationship here, but still I think there's
> something to be learned from the loose relationships found on the web.
>
> It's harder to describe how such systems might work, but the web
> teaches us that it's easier to implement and build on systems that
> allow loose relationships than those that demand strict compliance.
>
> What I'm really arguing for is leveraging what we've already got:
> we're publishing our catalogs to the web, so let's make sure we're
> putting them out there with good semantic markup so it's easy to
> parse the data out of them. That way we could build spiders that
> harvest that data from all those decentralized catalogs.
>
> What we do with it from there is another matter, but here's the big
> win: the architecture allows us to try lots of things in parallel,
> each making our own decisions about how to use it. That's important,
> because it will take us a while to make sense of our theories of how
> this does or should work, and it'll allow us to evolve more
> organically than with a centralized database.
>
> --Casey
>
>
> On May 16, 2007, at 7:56 PM, Corey A Harper wrote:
>
>> Imagine if OCLC's database were built around this principle, and a
>> nightly SPARQL query could retrieve any statement Doug made about a
>> resource when he added a field or subfield, and add it to the
>> data-pool of a local catalog.  Imagine then that another query could
>> pull in the tags that Bob added to LibraryThing and the reviews Sarah
>> posted on Amazon.
>
>
>
> Casey Bisson
> __________________________________________
>
> Information Architect
> Plymouth State University
> Plymouth, New Hampshire
> http://MaisonBisson.com/blog/
> ph: 603-535-2256

--
Corey A Harper
Metadata Services Librarian
Bobst Library
New York University
70 Washington Square South
New York, NY  10012
212.998.2479
corey.harper_at_nyu.edu
Received on Wed May 16 2007 - 20:43:50 EDT