Re: OCLC and Michigan State at Impasse Over SkyRiver Cataloging, Resource Sharing Costs

From: Adrian Pohl <pohl_at_nyob> Date: Wed, 10 Mar 2010 17:24:59 +0100 To: NGC4LIB_at_LISTSERV.ND.EDU

Adrian Pohl
Direktionsassistenz (Manager's assistant)
hbz - Hochschulbibliothekszentrum des Landes NRW
Tel: (+49)(0)221 - 400 75 235
http://www.hbz-nrw.de

Visit hbz at the 4. Leipziger Kongress für Information und Bibliothek
(4th Leipzig Congress for Information and Libraries), stand +11 on
level 1!

>>> Weinheimer Jim <j.weinheimer_at_AUR.EDU> 10.03.2010 09:24 >>>
> I agree that holdings of a library should be "visible to
> researchers," but this is becoming far more complex a task than it
> used to be. Just making a record in the local catalog and throwing it
> into Worldcat is definitely not enough today. To compensate, there
> are many more avenues available today than ever before.
>
> I console myself with the thought that solving these problems could
> turn out to be one of the most fascinating eras in the history of
> librarianship!

Jim, you are absolutely right. This discussion reminds me of my first
question to this mailing list in January 2009: 

> I'm working for a German library consortium. Some of our member
> libraries are interested in participating in WorldCat. We are
> currently trying to evaluate the possible benefits that WorldCat
> participation would bring. One of the benefits would be what OCLC
> calls "web scale for libraries". At the moment we are trying to
> better estimate the possible benefits in this area. So, it would be
> very helpful if you could provide me with some statistics about the
> development of OPAC accesses via WorldCat.org since its start in
> August 2006.

I am still interested in some numbers, although the statistics Tim
referenced already make a clear point: WorldCat.org doesn't have a
significant impact on the web. So I believe that individual library
catalogs haven't seen any significant growth in catalog access after
becoming visible in WorldCat.org. The consortium I work for decided not
to pay the high prices for uploading its holdings to WorldCat. The
alternative we started exploring is migrating library data as Open Data
to the Linked Data Web.

I believe the way forward, and many of the relevant questions and
problems the library community faces in really becoming part of the
web, have been addressed in this worthwhile thread. I will try to
summarize the important points and add my own thoughts:

1.) The first necessary step is a political and legal one. It simply
has to be said: "We publish our raw data to the public domain."

2.) After this necessary first step we need to develop an open data
practice. This addresses questions like describing and versioning
datasets and registering them in one place (like the CKAN group for
bibliographic data: http://ckan.net/group/bibliographic). (Nat
Torkington recently wrote a very interesting post concerning this:
http://radar.oreilly.com/2010/03/truly-open-data.html) Eventually the
data might be made useful for copy cataloging but it won't advance the
libraries' visibility on the web.

3.) So we don't just want to share the data and version it in the form
of existing, ancient standards (MARC or, in Germany, MAB = Maschinelles
Austauschformat für Bibliotheken ~ machine interchange format for
libraries, which dates back to 1973). We want to transform it into data
that people outside the library world can use. So we need an open
environment for collaborative work where we can share code,
documentation and best practices.

4.) Linked Data is one way of becoming part of the web. To go down this
road we have to experiment with the best way of publishing our data to
the Linked Data Web. This means creating an ontology for the data at
hand and migrating the different data sets in their different formats
to RDF (the standard data model for Linked Data). Using RDA as the
underlying ontology means spending a lot of time and energy on making
it fit one's legacy data. Much work has already gone into developing
RDA, but it still has to be made attractive to use. (See
http://catalogsofbabes.wordpress.com/2010/03/08/rda-why-it-wont-work/
for the difficulty of RDA.) I see RDA as a kind of überontology from
which individual institutions derive their "application profiles",
which focus only on a part of the world RDA describes. After some
engagement with RDA and the RDA metadata registry
(http://metadataregistry.org/rdabrowse.htm) I know that this won't be
easy. (BTW, it is unfortunate that the RDA metadata registry is
published under a BY-NC-SA licence. It should rather be put in the
public domain.) An alternative to RDA could be building on an ontology
based on Dublin Core or on the already used Bibliographic Ontology
(http://bibliontology.com/).

5.) Migrating our existing data to RDF isn't enough and might not even
be the most important task. Web-based cataloging services have to be
developed at the same time to produce bibliographic data as Linked
Data. The big vendors hesitate to pick up this challenge as they
naturally aren't interested in Linked Open Data - because you can't
make money from metadata that's in the public domain. (Last year we
held a well-received conference, "Semantic Web and libraries"
(http://www.swib09.de) - but no representative of any vendor
attended...)
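
To make the migration step in point 4 a little more concrete, here is a
minimal sketch of what turning one bibliographic record into RDF could
look like, using the Dublin Core element set as the vocabulary and
N-Triples as the serialization. Everything specific in it is an
assumption for illustration only: the record layout (a plain dict with
"id", "title", "creator" and "date"), the base URI under example.org and
the function names are hypothetical, and a real MARC or MAB conversion
would need a far richer mapping than this.

```python
# Sketch only: maps a simplified, hypothetical record to N-Triples lines.
# The Dublin Core namespace is real; the record layout and base URI are
# assumptions made for this example.

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from local field names to Dublin Core elements.
FIELD_TO_ELEMENT = {"title": "title", "creator": "creator", "date": "date"}


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record, base_uri="http://example.org/resource/"):
    """Return one N-Triples line per mapped field value of the record."""
    subject = "<" + base_uri + record["id"] + ">"
    triples = []
    for field, element in FIELD_TO_ELEMENT.items():
        values = record.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            triples.append(
                '%s <%s%s> "%s" .' % (subject, DC, element, escape_literal(value))
            )
    return triples


if __name__ == "__main__":
    example = {
        "id": "b1",
        "title": "An example title",
        "creator": ["Doe, Jane", "Roe, Richard"],
        "date": "1973",
    }
    for line in record_to_ntriples(example):
        print(line)
```

The point of the sketch is only that each field of a record becomes a
statement about a resource with its own URI. Choosing the ontology (RDA,
Dublin Core, the Bibliographic Ontology) changes the property URIs, not
the basic shape of the conversion.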

Adrian