Re: Tim Berners-Lee on the Semantic Web

From: Karen Coyle <lists_at_nyob>
Date: Thu, 29 Oct 2009 08:49:30 -0700
To: NGC4LIB_at_LISTSERV.ND.EDU
Alexander Johannesen wrote:
> Are you referring to my example of the UN? Of course we would use
> that. If people aren't using shared URIs the whole point of the
> Semantic *Web* falls apart. It is supposed to be one massive shared
> triplet-store, where "sameness" / identity is in the equality of URI
> strings.
>

Hi, I wanted to get back to this because my original comments were so 
inarticulate... sometimes it takes a while to think about these things.

You used an example of http://www.un.org as an identifier for the 
organization United Nations, and you asked:

> But if I say that a subject is about
> "http://www.un.org/", is my subject the UN as an organisation, or
> their website as a whole, or the page that HTTP returns?

What I see here is a problem of repurposing a *location* as an *identifier*. But I think there is a simple solution. Here are two statements:

    (something) has subject (http://www.un.org)
    (something) has subject [organization that has home page (http://www.un.org)]

The first one helps us link data only if we have an agreement on what we will use for the identifier. However, as I believe is often the case for things that exist in the "real world", communities will have different identifiers for the organization (the LC name authority heading, a number of different standards for institution codes, etc.). We can't know what each other's identifiers are. We can, however, all know what the home page of the organization is, or what it calls itself in English. So, using an LC name authority (LCNA) identifier as an example, we could have:

    [1] (n79021345) has home page (http://www.un.org)
     
    [2] (n79021345) calls itself (lang=en / United Nations)

Statement [1] will link to any other statement with: 
    ... has home page (http://www.un.org)
as part of its triple, and statement [2] does the same with triples containing:
    ... calls itself (lang=en / United Nations)

The latter, however, being a language string, is unlikely to have the necessary uniqueness on its own. It can still be used as part of an inference decision that two triples may be referring to the same thing. Another example of this is the use of email addresses to "identify" persons. An email address is an identifier that works well in certain circumstances, but you wouldn't want to lose the fact that what you're working with is an email address, since that fact becomes part of the inference capabilities.
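
To make this a bit more concrete, here is a rough sketch in Python of the kind of linking I have in mind. It is only an illustration, not a proposal for how a real triple store should do it. The "lcna:" prefix, the "org:" identifiers, the predicate names and the strong/weak split are all made up for the example; only n79021345 and the home page come from the statements above.

```python
from collections import defaultdict

# Triples as (subject, predicate, object). The "lcna:" and "org:" prefixes,
# the org identifiers and the predicate names are hypothetical.
TRIPLES = [
    ("lcna:n79021345", "has_home_page", "http://www.un.org"),
    ("lcna:n79021345", "calls_itself", ("en", "United Nations")),
    ("org:UN-0001", "has_home_page", "http://www.un.org"),
    ("org:UN-0001", "calls_itself", ("en", "United Nations")),
    ("org:XY-0002", "calls_itself", ("en", "United Nations")),
]

# Assumption for this sketch: a shared home page is strong evidence of
# sameness, while a shared language string is only a hint.
STRONG_PREDICATES = {"has_home_page"}


def candidate_matches(triples):
    """Map each pair of subjects to the predicates on which they share a value."""
    subjects_by_value = defaultdict(set)
    for subject, predicate, obj in triples:
        # Keep the predicate in the key so that a home page stays a home page
        # and a label stays a label; the raw value alone is not the identifier.
        subjects_by_value[(predicate, obj)].add(subject)

    evidence = defaultdict(set)
    for (predicate, _obj), subjects in subjects_by_value.items():
        ordered = sorted(subjects)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                evidence[(first, second)].add(predicate)
    return dict(evidence)


def classify(predicates):
    """Turn the set of shared predicates into a rough inference decision."""
    if predicates & STRONG_PREDICATES:
        return "likely same"
    return "possibly same"


matches = candidate_matches(TRIPLES)
for pair, predicates in sorted(matches.items()):
    print(pair, sorted(predicates), classify(predicates))
```

The point of keeping the predicate alongside the value is the same one I make about email addresses: the match on "United Nations" alone only gets you a "possibly same", and you can only tell the difference because you still know which matches came from a home page and which from a label.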

So my bottom line is that when using an existing element (whether a URI or a string), keeping the original meaning will allow better linking/inferencing, especially if it is a meaning that is likely to be independently discovered by others. 

kc

-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------
Received on Thu Oct 29 2009 - 11:52:29 EDT