Mapping vocabularies (was: LCSH and Linked Data)

From: Jakob Voss <jakob.voss_at_nyob>
Date: Fri, 8 Apr 2011 10:10:22 +0200
To: CODE4LIB_at_LISTSERV.ND.EDU
Hi,

Any transformation of a controlled vocabulary, either in format (MARC to 
RDF) or in coverage (e.g. from LCSH to DDC, MeSH, GND, etc.), has to 
decide whether

(a) there is a one-to-one (or one-to-zero) mapping between all concepts
(b) you need n-to-m or even more complex mappings
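
A minimal sketch of how one might test which case a given set of mappings 
falls into. The function name and the concept identifiers are made up for 
illustration; they are not taken from VIAF, LCSH, or any real authority file.

```python
from collections import defaultdict


def mapping_kind(pairs):
    """Return "a" for a one-to-one (or one-to-zero) mapping, "b" otherwise.

    `pairs` is an iterable of (source_concept, target_concept) tuples.
    Source concepts without any pair are the one-to-zero case and do not
    need to be listed.
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in pairs:
        targets_of[source].add(target)
        sources_of[target].add(source)
    # Case (a): no source has several targets and no target has several sources.
    simple = (all(len(t) <= 1 for t in targets_of.values())
              and all(len(s) <= 1 for s in sources_of.values()))
    return "a" if simple else "b"


# Hypothetical identifiers, for illustration only.
person_pairs = [("a:person1", "b:person7")]
england_pairs = [
    ("a:England", "b:KingdomOfEngland"),
    ("a:England", "b:EnglandToday"),
    ("a:GreatBritain", "b:EnglandToday"),
]

print(mapping_kind(person_pairs))   # a
print(mapping_kind(england_pairs))  # b
```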

Mapping name authority files in VIAF was an instance of (a) because we more or 
less agree that a person is always the same person.

But mapping authority data in MARC from different institutions looks like 
an instance of (b). Not only are concepts like "England" fuzzier 
than people, they are also used in different contexts for different 
purposes, depending on the cataloging rules and their specific 
interpretation. It does not help to argue about MARC fields because there 
simply is no easy one-to-one mapping between, for instance:

- The Kingdom of England (927–1707)
- The area of the Kingdom of England (927–1707)
- The country of England as it exists today
- The area of England including the Principality of Sealand
- The area of England excluding the Principality of Sealand
- The whole island of Great Britain
- The island of Great Britain including Ireland
- The island of Great Britain including Northern Ireland
- The Kingdom of Great Britain (1707 to 1801)
- The United Kingdom of Great Britain and Ireland (1801 to 1922)
- etc.

I gave a talk about the fruitless attempt to put reality in terms of the 
Semantic Web at Wikimania 2007 (starting with slide 12):
http://www.slideshare.net/NCurse/jakob-voss-wikipedia2007

Instead of discussing how to map terms and concepts "the right way", you 
should think about how to express fuzzy and complex mappings. The SKOS 
mapping vocabulary provides some relations for this purpose. I can also 
recommend the DC2010 paper "Establishing a Multi-Thesauri-Scenario based 
on SKOS and Cross-Concordances" by Mayr, Zapilko, and Sure:
http://dcpapers.dublincore.org/ojs/pubs/article/viewArticle/1031
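
As a rough illustration of what such fuzzy mappings can look like, the sketch 
below writes a few mapping statements as N-Triples using the SKOS mapping 
properties (exactMatch, closeMatch, broadMatch, narrowMatch, relatedMatch). 
The example.org URIs and the choice of relation for each pair are assumptions 
made up for this example, not an authoritative mapping and not taken from 
the paper.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# The mapping properties defined in the SKOS Reference.
MAPPING_RELATIONS = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def mapping_triple(source, relation, target):
    """Serialize one SKOS mapping statement as an N-Triples line."""
    if relation not in MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping relation: {relation}")
    return f"<{source}> <{SKOS}{relation}> <{target}> ."


# Hypothetical vocabularies; real URIs would come from the vocabularies mapped.
A = "http://example.org/vocab-a/"
B = "http://example.org/vocab-b/"

# One fuzzy source term related to several more precise target concepts.
# The relation chosen for each pair is an illustrative guess.
mappings = [
    (A + "England", "closeMatch", B + "EnglandToday"),
    (A + "England", "narrowMatch", B + "KingdomOfEngland"),
    (A + "England", "broadMatch", B + "IslandOfGreatBritain"),
    (A + "England", "relatedMatch", B + "KingdomOfGreatBritain"),
]

triples = [mapping_triple(*m) for m in mappings]
for line in triples:
    print(line)
```

Note that "A broadMatch B" states that B is the broader concept, and 
"A narrowMatch B" that B is the narrower one.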

If you do not want to bother with complex mappings but prefer 
one-to-one mappings, you should not talk about differences such as 
England as a corporate body, England as a place, or England as a 
nationality.

Sure, you can put all these meanings into a broad and fuzzy term 
"England", but then stop complaining about semantic differences and use 
the term as an unqualified subject heading with no specific meaning, for 
anything that is related to any of the many ideas that anyone can call 
"England". This is the way that full-text retrieval works.

You just can't have both simple mappings and precise terms.

Jakob

-- 
Jakob Voß <jakob.voss_at_gbv.de>, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de
Received on Fri Apr 08 2011 - 04:12:54 EDT