Re: FRBR WEMI and identifiers

From: Ross Singer <rossfsinger_at_nyob>
Date: Wed, 18 Nov 2009 12:51:37 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU
Jim,

I tried to address much of what you're talking about here in this post:
https://listserv.nd.edu/cgi-bin/wa?A2=ind0911&L=NGC4LIB&T=0&F=&S=&P=113503

But I'll touch on a few other points, as well.

First, I'd like to reiterate:  the limitations of SKOS are not a
limitation of RDF or id.loc.gov/authorities.  RDF allows you to pick
and choose any number and any variety of vocabularies to model your
data.  LC chose SKOS to lay the groundwork because 1) it exists
already 2) people understand it and know how to use it 3) it tackles
the important part of defining the concepts, giving them URIs and
associating them with particular schemes (topicalTerm, genreFormTerm,
geographicLocation, personalName, etc.) as well as the BT/NT
relationships.

As soon as there is an agreed upon way to model coordination, LC can
just layer that in on top of what's already there (since there's no
reason it would have to change), but there's no point in doing this
until there's some agreement on how to implement it, since nobody will
know what to do with it, anyway.

The rest of this I'll reply inline:

On Tue, Nov 17, 2009 at 3:22 PM, Weinheimer Jim <j.weinheimer_at_aur.edu> wrote:

> Not at all, but please point out to me how I can take "Communication--Political aspects--United States" using the power of the semantics contained within the 650 field:
> 650 \0$aCommunication$xPolitical aspects$zUnited States (topical subject - topical subdivision - geographical subdivision)
> to get a more flexible display of the type I pointed out:
> topical subdivision - geographical subdivision - topical subject
> Political aspects United States Communication,
> using the RDF from
> http://id.loc.gov/authorities/sh2009120881#concept
>

So with that particular resource, the RDF looks like:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://id.loc.gov/authorities/sh2009120881#concept>
    dcterms:created "2009-08-04T00:00:00-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    dcterms:modified "2009-08-04T08:30:42-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    dcterms:source "Work cat.: The ideology of international communications, 1992"@en ;
    a skos:Concept ;
    owl:sameAs <info:lc/authorities/sh2009120881> ;
    skos:inScheme <http://id.loc.gov/authorities#conceptScheme>,
        <http://id.loc.gov/authorities#topicalTerms> ;
    skos:prefLabel "Communication--Political aspects--United States"@en .

So imagine something in future like:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://id.loc.gov/authorities/sh2009120881#concept>
    dcterms:created "2009-08-04T00:00:00-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    dcterms:modified "2009-08-04T08:30:42-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    dcterms:source "Work cat.: The ideology of international communications, 1992"@en ;
    a skos:Concept ;
    owl:sameAs <info:lc/authorities/sh2009120881> ;
    skos:inScheme <http://id.loc.gov/authorities#conceptScheme>,
        <http://id.loc.gov/authorities#topicalTerms> ;
    skos:prefLabel "Communication--Political aspects--United States"@en ;
    # lcsh:coordinates is imagined; no such vocabulary (or prefix) is defined yet
    lcsh:coordinates
        <http://id.loc.gov/authorities/sh85029027#concept>,
        <http://id.loc.gov/authorities/sh00005651#concept>,
        <http://purl.org/NET/marccodes/gacs/n-us#location> .

This uses the simple example Ed gave before.  I have to use the
marccode URI for the US because corporate names (which political
geographic entities fall under) aren't in id.loc.gov yet.  Of course,
we _can_ do this, because
http://purl.org/NET/marccodes/gacs/n-us#location knows that it has an
analogous authorized subject heading and, once that heading actually
exists, would point at it.

But that way we have the linkage between this concept uri and the
topicalTerm concept http://id.loc.gov/authorities/sh85029027#concept
"Communication", the generalSubdivision concept
http://id.loc.gov/authorities/sh00005651#concept "Political aspects",
and the geographicSubdivision concept "United States" (when it
exists).

This has nothing to do with the string skos:prefLabel
"Communication--Political aspects--United States"@en other than the
fact that the prefLabel is generated, by convention, as a
concatenation of these resources.
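
As a rough illustration of what a client could do with that linkage,
here is a minimal Python sketch.  It is not anything id.loc.gov
provides: the lcsh:coordinates property is still imagined, and the
COMPONENTS table, the COORDINATES list and the display() function are
names made up for this example.  It only shows that once the parts are
separate resources, the display order no longer depends on the
prefLabel string:

```python
# Hypothetical lookup table: what a client might hold after dereferencing
# each URI listed under the imagined lcsh:coordinates property.
# Maps component URI -> (role, label); roles and labels follow the message.
COMPONENTS = {
    "http://id.loc.gov/authorities/sh85029027#concept":
        ("topicalTerm", "Communication"),
    "http://id.loc.gov/authorities/sh00005651#concept":
        ("generalSubdivision", "Political aspects"),
    "http://purl.org/NET/marccodes/gacs/n-us#location":
        ("geographicSubdivision", "United States"),
}

# The component URIs of the heading, in the order the prefLabel lists them.
COORDINATES = list(COMPONENTS)


def display(coordinates, role_order, separator=" "):
    """Join the component labels in the order given by role_order."""
    by_role = {}
    for uri in coordinates:
        role, label = COMPONENTS[uri]
        by_role[role] = label
    return separator.join(by_role[r] for r in role_order if r in by_role)


# Conventional LCSH order, reproducing the prefLabel string.
print(display(COORDINATES,
              ["topicalTerm", "generalSubdivision", "geographicSubdivision"],
              "--"))
# -> Communication--Political aspects--United States

# The order Jim asked for: subdivisions first, topical subject last.
print(display(COORDINATES,
              ["generalSubdivision", "geographicSubdivision", "topicalTerm"]))
# -> Political aspects United States Communication
```

Splitting the prefLabel on "--" would give the same three strings, but
not which one is the topical term and which are subdivisions; that is
the part the linkage adds.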

> It was my understanding from everything in this thread that this cannot be done using RDF and then Karen pointed out that the problem is with SKOS. That's fine, but the final result is the same: that what we have in id.loc.gov is totally inflexible.

No, that it was built on SKOS is what makes it totally flexible.  Much
more flexible, honestly, than your example of www.biblio.tu-bs.de,
which restricts you to the relationships that www.biblio.tu-bs.de
makes available and _only_ those relationships (and you can't reuse
them for your own purposes anyway).

>That's why I said that other technologies may be needed to do these sorts of things. I understand very clearly that the purpose of the RDF files from the id.loc.gov site are supposed to be there only for referencing, but I also tried to point out that this is not nearly enough to get a web designer to use it in reality. The web designer must see added-value, and on the web, this means links above all else. Can you say why somebody would use id.loc.gov and not dbpedia? And even if we placed our links into the relevant URIs in dbpedia, there would still be no added value since there would still be no links.
>

Again, I still don't see why any web designer would use dbpedia.org,
period.  You have yet to explain why you're pointing people at
dbpedia, but let's back up.

Your web designer's relationship with id.loc.gov/authorities is
roughly the same as your web designer's relationship with OCLC
Connexion or http://www.galileo.aur.it/cgi-bin/koha/admin.pl (or
whatever).  It's an identifier for subject headings to use in the
data.

But, going back to your general question, "should we give up LCSH and
use dbpedia instead?" -- my question would be, without something like
id.loc.gov/authorities, how would this even be remotely possible,
anyway?

Your basis for this thread was to mitigate the effort and expense of
our current cataloging process by ignoring RDA and FRBR and, instead,
tweaking AACR2.  But then you ask if we should drop LCSH for dbpedia.
These seem completely disjoint.  How would we begin to justify the
retrospective conversion?

Instead, id.loc.gov provides /exactly/ the sort of linkage between
resources like dbpedia, geonames, etc. and our legacy data.

As an aside, you can't "place your links" in dbpedia.  Dbpedia is an
extraction of wikipedia's "infobox" data that's done on a quarterly
(or so) basis.  The dbpedia folks decide what links will be in there.
This means it's much more useful to "place our links" in wikipedia or
freebase (which is both structured /and/ editable).

If the id.loc.gov uris were in freebase or dbpedia, it's then trivial
for id.loc.gov to reify them back to those resources.  Meanwhile,
lcsubjects.org is intended to exist as a sandbox of sorts for these
sorts of relationships, deferring to id.loc.gov as the 'authority'.

> If I am wrong, please enlighten me, I would love to be wrong on this and learn that perhaps the LCSH may be of real use to everyone, but please focus on what is possible today, now, with the tools and data we have at hand, not what might be done after 10 years and the willing cooperation of half of the people on the internet, because there may be many other solutions available in 10 years, and I don't know how much cooperation we'll get. People have  been waiting for a long time already to see something cool emerge from libraries and almost nothing has happened.

What I'm talking about is completely possible today.  In fact, it was
possible around two weeks ago when I trotted out:

http://purl.org/NET/lccn/36029351#i

and

http://purl.org/NET/lccn/93710188#i

and

http://purl.org/NET/lccn/85006714#i

All of this, /every assertion/ is made from the original MARC.  When
http://id.loc.gov/authorities/sh85005167#concept (or
http://lcsubjects.org/subjects/sh85005167#concept -- it doesn't
matter) is related to
http://dbpedia.org/resource/List_of_animal_sounds (which is an easy
match -- this will even appear in the original lcsubjects.org label
match roundup) or http://id.loc.gov/authorities/sh87004213#concept to
http://dbpedia.org/resource/Folk_rock or
http://viaf.org/viaf/32830310.rwo to
http://dbpedia.org/resource/Susanna_Rowson things become even more
powerful.
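
To make that concrete, here is a small Python sketch of following
such links from the URIs a record already carries.  The LINKS table
holds only the three pairings named above; the function names, and the
idea of rewriting an lcsubjects.org URI to its id.loc.gov form by
swapping the prefix, are assumptions for the example, and it says
nothing about which RDF property would express the relationship:

```python
# The three pairings named in the message: source URI -> dbpedia resource.
# Which predicate would express the relationship is deliberately left open.
LINKS = {
    "http://id.loc.gov/authorities/sh85005167#concept":
        "http://dbpedia.org/resource/List_of_animal_sounds",
    "http://id.loc.gov/authorities/sh87004213#concept":
        "http://dbpedia.org/resource/Folk_rock",
    "http://viaf.org/viaf/32830310.rwo":
        "http://dbpedia.org/resource/Susanna_Rowson",
}

LCSUBJECTS_PREFIX = "http://lcsubjects.org/subjects/"
ID_LOC_PREFIX = "http://id.loc.gov/authorities/"


def canonical(uri):
    """Treat an lcsubjects.org subject URI as its id.loc.gov equivalent.

    Assumption for this sketch: the two differ only in their prefix.
    """
    if uri.startswith(LCSUBJECTS_PREFIX):
        return ID_LOC_PREFIX + uri[len(LCSUBJECTS_PREFIX):]
    return uri


def related_resources(record_uris):
    """Return the dbpedia URIs reachable from the URIs on a record."""
    found = []
    for uri in record_uris:
        target = LINKS.get(canonical(uri))
        if target is not None:
            found.append(target)
    return found


# A record citing the lcsubjects.org form still resolves to the same link.
print(related_resources([
    "http://lcsubjects.org/subjects/sh85005167#concept",
    "http://viaf.org/viaf/32830310.rwo",
]))
```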

But it's totally inefficient to focus on the decentralized "records".
Instead we have to focus on agreed-upon identifiers for it to work.
That's why I chose to build this on LCCN.  It's an extremely common
record identifier and, since it's not legally possible for me to do
this on Worldcat, a pretty good way to show what's possible to the
maximum effect.
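
A minimal Python sketch of why a shared identifier matters: copies of
the "same" record held in different places collapse onto one URI.  The
URI pattern is read off the three examples above; the library names
and both functions are hypothetical, and real LCCNs may need more
normalization than the whitespace stripping done here:

```python
from collections import defaultdict

# URI pattern taken from the three example URIs in this message.
LCCN_BASE = "http://purl.org/NET/lccn/"


def lccn_uri(lccn):
    """Build the identifier URI for a record from its LCCN."""
    return f"{LCCN_BASE}{lccn.strip()}#i"


def group_by_identifier(holdings):
    """Group (source, lccn) pairs under the shared LCCN-based URI."""
    grouped = defaultdict(list)
    for source, lccn in holdings:
        grouped[lccn_uri(lccn)].append(source)
    return dict(grouped)


# Hypothetical holdings: two libraries hold the same record, one holds another.
holdings = [
    ("library-a", "36029351"),
    ("library-b", " 36029351 "),
    ("library-a", "93710188"),
]
for uri, sources in group_by_identifier(holdings).items():
    print(uri, sources)
```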

-Ross.
Received on Wed Nov 18 2009 - 12:54:11 EST