Re: FRBR WEMI and identifiers

From: Weinheimer Jim <j.weinheimer_at_nyob>
Date: Fri, 20 Nov 2009 09:48:21 +0100
To: NGC4LIB_at_LISTSERV.ND.EDU
Ross,
While I am finding problems with many of the implementations of the LCSH linked data project, I wholeheartedly applaud the efforts behind it. It has taken far too much time, but that is very normal in libraries. At least, things now are finally going in the right direction, and in fact, I am a big fan of yours because I think it is only through your efforts with lcsh.info that things finally started moving. Without your work, I suspect there would still be nothing. I feel really lucky to have had this exchange with you, since I have learned a *lot* that otherwise, I would never have known.

I am not criticising LC either. I realize that there is simply too much to do right now and the future directions of our field are very murky, in many ways. But we still must be forthright in our criticism, especially now in these very difficult economic times. For example, a couple of days ago I read that, according to a US govt. report, 1/3 of all children in the US went hungry last year. (http://www.globalissues.org/news/2009/11/16/3526) This has made a rather deep impact on me in several ways, but one conclusion I think is inevitable: I think that more money and resources will (and must) go toward feeding and helping people, and without lots of new money coming in and the efforts to keep the budget in balance, the inescapable conclusion is that there will be less for efforts such as id.loc.gov. State and private institutions are cutting 'way, 'way back. Look at California and Harvard. These are some of the most serious times of our lives.

You mention Koha, and perhaps the open source/open access movement offers a way out. There are lots of people with lots of skills and ideas out there willing to experiment for nothing, so long as you give them data to play with that won't take them 2 years to master, plus a bit of motivation. In many ways, this is a frightening road to take since you wind up losing "control" of how your data is used, but otherwise, there is no way forward, it seems to me.

I read a statement about Google some time back (I don't remember who said it or where): the secret to Google's success is that they created tools to let *others* succeed. I like that idea. Perhaps we can take some of that mentality into other, more formal areas, such as in the FRBR user tasks.

Jim Weinheimer  

Jim, I honestly understand your concern.  I also understand the lack
of faith that a solution will appear in a timely fashion.  We are,
after all, not just talking about libraries, but the Library of
Congress, which has never been known for its haste.

Still, when you think about it, there's cause for some optimism on
this front (and, honestly, in many ways our optimism is unnecessary --
more on this in a minute).  lcsh.info took only a month or two to go
from the spark of inspiration to functional prototype.  It was up for
a couple of months, getting tweaks and refinements here and there and
then unceremoniously shut down.

When it finally reemerged in its current, double-breasted, navy suit,
it had taken less than a year to go from initial conception to fully
supported LC service.  What I'm saying is that this is hardly a
typical LC joint.

It's also undergoing constant iterative improvement.

I guess what I'm saying is that id.loc.gov isn't really operating on
library time.

But, at the end of the day, what's so special about id.loc.gov is that
/it doesn't matter what sort of time it runs on/.  If you are
unsatisfied with the level of atomicity that id.loc.gov uses to
describe their concepts, you have the full power to apply those
assertions yourself.

As an analogy, your institution uses Koha (it is enormously convenient
that I am having this conversation with you, btw).  I assume you didn't
pick Koha for its out-of-the-box OPAC interface.  But, because Koha is
open source, you have the full power to do anything you want to do to
improve the UI for your community, and if those improvements are
generally useful, you can contribute them back into the Koha project
for others' use.

The same applies to linked open data.  LC doesn't necessarily /have/
to provide the triples that enable concept coordination (although,
obviously, it would be a lot easier if they did), because by making
the framework available for all of these concepts to hang off of,
somebody with the time and inclination can do it instead.
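
A minimal sketch of what that "somebody" might do, in Python.  It
splits a pre-coordinated label on "--" and links the heading to each
component it can resolve.  The URIs are the ones quoted further down
in this thread; the lookup table, the function name, and the pairing
of sh00005651 with "Political aspects" are assumptions made for
illustration, and lcsh:coordinates is the hypothetical property from
that same example, not something id.loc.gov provides today.

```python
# Hypothetical label-to-URI lookup. The URIs are quoted in this thread;
# pairing sh00005651 with "Political aspects" is an assumption.
LABEL_TO_URI = {
    "Communication": "http://id.loc.gov/authorities/sh85029027#concept",
    "Political aspects": "http://id.loc.gov/authorities/sh00005651#concept",
    "United States": "http://purl.org/NET/marccodes/gacs/n-us#location",
}

# Illustrative property name taken from the example in this thread.
COORDINATES = "lcsh:coordinates"


def coordination_triples(heading_uri, pref_label):
    """Return one (subject, predicate, object) triple for each "--"
    component of pref_label that has a known URI."""
    triples = []
    for part in pref_label.split("--"):
        target = LABEL_TO_URI.get(part.strip())
        if target is not None:  # skip components we cannot resolve
            triples.append((heading_uri, COORDINATES, target))
    return triples


triples = coordination_triples(
    "http://id.loc.gov/authorities/sh2009120881",
    "Communication--Political aspects--United States",
)
for subject, predicate, obj in triples:
    print(f"<{subject}> {predicate} <{obj}> .")
```

Nothing here needs LC's cooperation beyond the stable URIs: the
assertions live in whatever store the third party runs.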

-Ross.

On Thu, Nov 19, 2009 at 8:56 AM, Weinheimer Jim <j.weinheimer_at_aur.edu> wrote:
> Ross,
>
> I really appreciate the in-depth answer you provided, but I still have some problems.
>
> First, your example of the SKOS:
>    owl:sameAs <info:lc/authorities/sh2009120881> ;
>    skos:inScheme <http://id.loc.gov/authorities#conceptScheme>,
> <http://id.loc.gov/authorities#topicalTerms> ;
>    skos:prefLabel "Communication--Political aspects--United States"@en;
>    lcsh:coordinates
> <http://id.loc.gov/authorities/sh85029027#concept>,
> <http://id.loc.gov/authorities/sh00005651#concept>,
> <http://purl.org/NET/marccodes/gacs/n-us#location> .
>
> is fine, and I believe it does exactly what I have been saying we need. But as you say, we must imagine this sometime in the future since it doesn't work now (not only because the term United States is not yet available, but because the system is not set up that way), i.e. there is currently no link from
> http://id.loc.gov/authorities/sh2009120881
>
> to either:
>
> <http://id.loc.gov/authorities/sh85029027#concept>,
> <http://id.loc.gov/authorities/sh00005651#concept>,
>
> The reason this does not work currently is because everything is still based on how people browsed a card or printed catalog. It all made perfect sense before, but fell apart with keyword searching. I think I need to stop and explain this because it may be becoming "lost information." For those who know this already, I apologize in advance.
>
> If someone wanted to find books on the politics of communications in the U.S., they would open the "C" catalog drawer and begin going through the cards until they would come to "Communication," which--in theory--would be a raised card with the information now available at http://id.loc.gov/authorities/sh85029027#concept printed on it. They would read and learn whatever was on this raised card, then they would continue to browse (for quite a while sometimes) until they ran across the subdivision "Political aspects" and continue to "United States."
>
> In reality, it never worked that well because librarians were scared that the catalog would get too big, so they placed very few guide cards into the catalog, and as a result, almost all of the cross-references were found only in the red books. The red books were therefore vital for the searcher to get all of these cross-references and such, but relatively few people actually used them. (I confess I did not understand their importance until library school, and I know I am not alone! BTW, a discussion is going on about the red books now on Autocat.) People, including me, nevertheless muddled through somehow.
>
> This system worked even worse when computers arrived with keyword searching, since people ceased browsing the headings as they were supposed to. With keyword searching, they would jump right into a record placed in the *middle of the file,* then see the subjects, and choose "Communication--Political aspects--United States." When they clicked on this link (if the system allowed it) they would be thrown into the *middle* of the old, card catalog browse list and not at the beginning as it was designed to work. This is how the LC catalog works right now. But the searcher still needs the information found under "Communication" plus lots more along the way, and now, the only way to get this information is to browse up and up to the top, often, after many, many screens. Of course, nobody does this except weirdos like me who understand how it is *supposed* to work. But, it's still a pain to do it and there must be something better.
>
> Therefore, the link from "Communication--Political aspects--United States" to "Communication" is absolutely critical if the headings are to be useful, since the traditional method of browsing does not work anymore, and hasn't for a long time.
>
> Therefore, while the structure you point out may work in the future, it doesn't appear to right now, and we are forced to imagine. The trouble with imagining is: I and lots of other people can imagine a lot and once people begin imagining, they can imagine how much more they could and should get, instead of only the internal relationships to "Communication," "Political aspects," and "United States." I think something like: http://dbpedia.org/page/Category:Communication would be found pretty useful by lots of people out there. Also, I would like some level of real world searches to be involved. My example has always been the real world keyword search for someone who is interested in battles of WWII: "wwii battles" which should retrieve the cross-references:
> See:  World War, 1939-1945--Aerial operations.
> See:  World War, 1939-1945--Campaigns.
> See:  World War, 1939-1945--Naval operations.
>
> which appears now only if you search: "World War, 1939-1945 battles" which nobody would ever do. With a structure as you lay out above and what I think is necessary, it is at least possible because there is a reference for "wwii" in http://id.loc.gov/authorities/sh85148273 which appears nowhere else. This structure reflects how the card catalog functioned. I have written some more on this in one of my "Open replies" to Thomas Mann, where I discuss some of the problems of subjects, at: http://eprints.rclis.org/13059/1/OntheRecordOpenReply.pdf
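
A minimal sketch of the lookup described above, in Python: expand a
variant form such as "wwii" to its authorized heading, then match the
expanded keywords against the unused form that carries the "See:"
references.  The two tables are toy data modelled on this example, not
a dump of id.loc.gov, and the function names are hypothetical.

```python
import re

# Variant form (lowercased) -> authorized heading. The thread says a
# "wwii" reference exists in sh85148273; the table itself is toy data.
ALT_LABELS = {"wwii": "World War, 1939-1945"}

# Unused form of a heading -> headings the searcher is referred to.
SEE_REFERENCES = {
    "World War, 1939-1945--Battles": [
        "World War, 1939-1945--Aerial operations",
        "World War, 1939-1945--Campaigns",
        "World War, 1939-1945--Naval operations",
    ],
}


def tokens(text):
    """Lowercased set of alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def see_references(query):
    """Return the "See:" headings whose unused form contains every
    keyword of the query, after expanding variant labels."""
    expanded = query.lower()
    for variant, heading in ALT_LABELS.items():
        pattern = rf"\b{re.escape(variant)}\b"
        expanded = re.sub(pattern, heading.lower(), expanded)
    wanted = tokens(expanded)
    if not wanted:  # an empty query matches nothing
        return []
    hits = []
    for unused_form, use_instead in SEE_REFERENCES.items():
        if wanted <= tokens(unused_form):
            hits.extend(use_instead)
    return hits


for heading in see_references("wwii battles"):
    print("See: ", heading)
```

The point of the sketch is only that the variant label and the
cross-reference have to be reachable from the same search; how a real
catalog indexes them is a separate question.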
>
> <snip>
> Your basis for this thread was to mitigate the effort and expense of our current cataloging process by ignoring RDA and FRBR and, instead, tweaking AACR2.  But then you ask if we should drop LCSH for dbpedia. These seem completely disjoint.  How would we begin to justify the retrospective conversion?
> </snip>
>
> I do not want that at all. We should be working hard to make LCSH actually useful for the public who now approach information retrieval in ways completely different from before (primarily, using keyword which, as I tried to show, makes the LCSH browses more or less incoherent). But even more importantly, we must create something that is genuinely useful to our users and this means to *not* merely recreate the functionality of the card catalog, but we should try to recreate its power--because there was a power that is not replicated in our library catalogs (as I have tried to demonstrate) and certainly not in Google and the like. This also shouldn't take 10 years to do.
>
> If it turns out that all we can do is recreate the traditional browses used in the card catalog, I am afraid it may not be worthwhile.
>
> Jim Weinheimer
>
Received on Fri Nov 20 2009 - 03:47:51 EST