On 10/13/2013 7:27 AM, Alexander Johannesen wrote:
<snip>
I'd just like to add a few bits about why Linked Data is or was
important. It's not really about sharing the data anymore, it has become
almost a secondary nice to have feature of meta data; surely you give
out the meta data in order to make things findable? No, the real
importance of why the library world should have been quicker and smarter
about it is about namespace real-estate, and the power of identifiers,
and it's this subtler connection in which things are truly found.
...
So, for example, we want to talk about Mark Twain. I could link my data
to a URI (which is just a string of letters to make up an identifier;
that it's a URI that you can plonk in a browser or do a HTTP GET on to
resolve it is an added bonus) so that we can make sure that when I talk
about Mark Twain, I mean the Mark Twain that is linked to this one
http://id.loc.gov/authorities/names/n79021164
And wouldn't it be great if that was the case?
</snip>
[Sorry for the long message, but it is usual with me and I can't find a
way to make it simpler]
If this is not just a rhetorical question but one that is seriously
asked, then I have an answer that, so far as my own experience is
concerned, is definitive (although others may have other experiences):
Wouldn't it be great if that was the case? The answer is decidedly no.
When id.loc.gov first came out I *really, really, really* wanted to
include it in my catalog in some way. I don't believe the API had come
out yet, but there are other ways if you are creative enough, although
they may not be perfect. I showed it to several of my users (students
and faculty) and while they found it kind of neat, especially the
"Visualization" tool, it did not provide them with any information they
thought would be useful for their purposes. I think this offers a clear
example of how differently a tool like this looks to a developer, to a
cataloger, and to a user.
The underlying purpose of the kind of record we see in id.loc.gov is
*not* so much to provide *data* to manipulate in all kinds of new and
wonderful ways, but to help people discover information that is *within
a particular collection*. So, with the record for Mark Twain, what is
there? We find various forms of his name, which is not important in and
of itself, but it is there so that when someone searches for e.g.
"Tuwen, Make", people see a reference that says: "See: Twain, Mark,
1835-1910." (http://1.usa.gov/162o37r)
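As a minimal sketch of how such a see-reference works, the variant form
can be treated as a key that points to the heading actually used in the
catalog. The table and the function name `resolve_heading` are
hypothetical and hand-built for illustration; nothing here is fetched
from id.loc.gov.

```python
# Hypothetical, hand-built see-reference table (not id.loc.gov data).
AUTHORIZED = "Twain, Mark, 1835-1910"

SEE_REFERENCES = {
    "Tuwen, Make": AUTHORIZED,  # variant form taken from the example above
}


def resolve_heading(query: str) -> str:
    """Return the authorized heading for a known variant form.

    A query that is not a known variant is returned unchanged.
    """
    cleaned = query.strip()
    return SEE_REFERENCES.get(cleaned, cleaned)


if __name__ == "__main__":
    print("See:", resolve_heading("Tuwen, Make"))
```

The point of the sketch is that the variant form has no value of its
own; it only redirects the searcher to the heading the catalog uses.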
In this case with Mark Twain, you also discover that he has different
"bibliographic identities" (in cataloger-speak), which translates into
normal speak as: if you want to find everything by Mark Twain, you also
have to look under the names:
Clemens, Samuel Langhorne, 1835-1910
Conte, Louis de, 1835-1910
Snodgrass, Quintus Curtius, 1835-1910
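A rough sketch of what "look under the other names too" means for a
search: one heading is expanded into the full list of related
identities. The dictionary and the function name `headings_to_search`
are assumptions made for illustration, not a real catalog API.

```python
# Hypothetical table of related bibliographic identities, built by hand
# from the names listed above.
RELATED_IDENTITIES = {
    "Twain, Mark, 1835-1910": [
        "Clemens, Samuel Langhorne, 1835-1910",
        "Conte, Louis de, 1835-1910",
        "Snodgrass, Quintus Curtius, 1835-1910",
    ],
}


def headings_to_search(heading: str) -> list[str]:
    """Return the heading plus every related identity recorded for it."""
    return [heading] + RELATED_IDENTITIES.get(heading, [])


if __name__ == "__main__":
    for name in headings_to_search("Twain, Mark, 1835-1910"):
        print(name)
```

An author with no recorded related identities simply yields a
one-element list, which is the Goethe case: one heading, nothing more
to search.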
The rest of the information in the record is for catalogers, documenting
where the information for each form of name came from and maybe some
more. So, for the user, this information is good only for *resource
discovery* within the realm of the *specific catalogs* that use these
forms. Other catalogs have different rules and different forms. For
example, pre-AACR2 rules (but lots of other rules too) treated the
concept of "bibliographic identities" differently, and the heading to
search for everything by Mark Twain was only "Clemens, Samuel Langhorne,
1835-1910". We can see how this was handled in the transition at
Princeton University with the first card under "Clemens, Samuel"
http://bit.ly/1ajsS8s but if you browse to the next
cards, you will see that his books are under "Clemens" as was correct
before AACR2.
So, the only real information from id.loc.gov that is of use to the
public is that they have to look under three other forms of name to find
everything by Twain. To revive this type of information would only
result in creating a tool that begins to work the way the catalog was
designed to work (i.e. back in the 19th century). That is important, by
the way.
If we look for an author who did not use pseudonyms, all we see are
different forms of the name, e.g. "Goethe, Johann Wolfgang von,
1749-1832" http://id.loc.gov/authorities/names/n79003362.html It is of
minimal use for the user to know that Goethe has also been published
under "Ko-tê, 1749-1832" although if they search for "Ko-tê" they will
find the reference to Goethe.
When we use the VIAF http://viaf.org/viaf/50566653/ we get something
that may be more useful to the public, which is the correct
form of name to search in different catalogs. So, we discover we need to
search "Твен, Марк 1835-1910" in Russian catalogs, and in Arabic
catalogs, توين، مارك، 1835-1910
A tool could be made to search Mark Twain's Russian form of name
automatically in the correct catalogs, e.g. http://bit.ly/1boDipB in the
Russian catalogs. It may--or may not--be useful for someone to know
that materials cataloged in Russia use this form and can be searched
correctly.
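A sketch of what such a tool might do, under stated assumptions: it
picks the form of name that a given catalog expects and builds a search
URL from it. The catalog keys, the example.org base URLs, and the
`author` query parameter are all hypothetical, and the Russian string
is an assumed rendering of the Russian heading, not data retrieved from
VIAF.

```python
from urllib.parse import urlencode

# Form of name per catalog. Keys are hypothetical labels; the Russian
# form is an assumption, not something fetched from VIAF.
FORMS_BY_CATALOG = {
    "lc": "Twain, Mark, 1835-1910",
    "ru": "Твен, Марк 1835-1910",
}

# Placeholder search endpoints; real catalogs have their own URLs
# and their own query parameters.
SEARCH_URLS = {
    "lc": "https://catalog.example.org/lc/search",
    "ru": "https://catalog.example.org/ru/search",
}


def build_search_url(catalog: str) -> str:
    """Build a search URL using the form of name that catalog expects."""
    form = FORMS_BY_CATALOG[catalog]
    return SEARCH_URLS[catalog] + "?" + urlencode({"author": form})


if __name__ == "__main__":
    for key in FORMS_BY_CATALOG:
        print(key, build_search_url(key))
```

Whether a searcher would actually want the results of such a query is
a separate question from whether the query can be built.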
In Worldcat Identities http://www.worldcat.org/identities/lccn-n79-21164
we find different information derived from the catalog. We see genres,
roles, his most widely held works and a word cloud of his subjects.
Worldcat Identities, and especially the word cloud at the bottom, *may*
be of the most use to the public of all of these tools, but it needs to
be tested. Again, when I have shown these tools to people, although
they found them interesting, they could not tell me how those tools
could help them in any substantive way in anything they could imagine.
Compare these tools to dbpedia http://dbpedia.org/page/Mark_Twain that
gives lots of concrete information and tons of links about Mark Twain.
Today, all this can be linked together with linked data (which can
definitely be done), but following John Marr's questions, it seems to
me that to do so would be to create the very definition of "information
overload".
I want to make it clear that I am *not* saying that some kind of tool
should not be built, because it definitely should be built, but we must
look at it through the eyes of the person consuming it. Otherwise, we
may be creating something for *us* and not for the people who need to
use it.
Linked data may end up creating a different kind of chaos. This is why I
say that linked data *may* create something useful for the public, but
it may just as well confuse them more than ever.
--
James Weinheimer weinheimer.jim.l_at_gmail.com
First Thus http://catalogingmatters.blogspot.com/
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html
Received on Sun Oct 13 2013 - 16:58:28 EDT