Re: Tim Berners-Lee on the Semantic Web--Missing His Main Point

From: Karen Coyle <lists_at_nyob>
Date: Fri, 23 Oct 2009 06:32:29 -0700
To: NGC4LIB_at_LISTSERV.ND.EDU

James Weinheimer wrote:
> The information world could be working with our data *right now* but since
> we feel it's not in the "best" format using linked data, in effect we are
> preventing people from using our data at all. As Berners-Lee said, what is
> absolutely vital is that your data exists out there, even if it's only in a
> CSV. I agree that only putting up the "text" before it is transformed into
> linked data is much inferior to what it could be, but otherwise, we provide
> people with nothing at all.
>   

Well, it's "out there" in MARC, in the sense that anyone can pull 
records out of library catalogs (and some folks do). The full Library of 
Congress file is available for downloading from the Internet Archive, as 
is the entire Open Library (23 million bib records), which gives out a 
full dump periodically (although not in MARC). Also, there is a lot of 
bibliographic data on the Web in various formats -- for example, Amazon 
has a tremendous amount of bibliographic data. LibraryThing has used 
library data, as well as data from Amazon and other sources.

The question isn't "best" but "linked" if we want re-use. The RDA 
properties have been defined in the Metadata Registry in a way that can 
be used for linked data. True, it's RDA, but the overlap in actual data 
elements between AACR and RDA is pretty complete. The controlled 
vocabularies for bib description are also there. And LC has put up 
LCSH in a linked data format, although it's not the complete headings, 
just the authority records (which generally get extended when headings 
are created). So basically that's all that's needed. It should be 
possible today to put bibliographic data on the web in a linked format.
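
As a rough illustration of what "a linked format" could look like in 
practice, here is a minimal Python sketch that writes one bibliographic 
record out as N-Triples. Everything specific in it is an assumption: the 
record, its example.org URIs, and the helper names are hypothetical, and 
Dublin Core terms stand in for the RDA properties, whose registry URIs 
are not given here.

```python
# Hypothetical record: the identifiers and values are placeholders,
# not real catalog data.
record = {
    "id": "http://example.org/bib/1",
    "title": "An Example Title",
    "creator": "Example, Author",
    "subject_uri": "http://example.org/subjects/placeholder",
}

# Dublin Core terms namespace, used here as a stand-in vocabulary.
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape a string for use inside an N-Triples quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(rec):
    """Return a list of N-Triples lines describing one record."""
    subject = "<%s>" % rec["id"]
    return [
        # Title and creator are plain literals.
        '%s <%stitle> "%s" .' % (subject, DCTERMS, escape_literal(rec["title"])),
        '%s <%screator> "%s" .' % (subject, DCTERMS, escape_literal(rec["creator"])),
        # The subject is a URI, so it links to another resource
        # instead of repeating a text string.
        "%s <%ssubject> <%s> ." % (subject, DCTERMS, rec["subject_uri"]),
    ]


if __name__ == "__main__":
    for line in to_ntriples(record):
        print(line)
```

The part that makes this "linked" rather than just "out there" is the 
third statement: the subject is a URI that other data can point at, 
which is what a controlled vocabulary published as linked data provides.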

Solving the WEMI problem isn't necessary for this. The WEMI problem, in 
my mind, has to do with sharing cataloging and making that process 
richer as well as more efficient. It may also facilitate some uses, like 
work-based displays, but those can be derived today.

But there's one other thing: OCLC. Remember the OCLC proposed policy? I 
have been in the position of asking libraries for dumps of their 
catalogs, and mostly they say "no" because they don't believe that the 
records are theirs to give. It may seem either silly or trivial, but in 
the US it is taken very seriously. It's rather ironic that there are 
libraries letting Google digitize the books in their collections, but 
they won't let anyone have their catalog of metadata. (Yes, the policy 
is under revision, but it will have as one of its purposes the 
protection of the WorldCat database... that has been made clear.)

kc

-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------