Re: Leveraging Authority Data in Keyword Searches

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Mon, 4 May 2009 22:14:49 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU
Ha, so, actually, compare to Wikipedia. Wikipedia is also essentially a topical controlled vocabulary, and an individual article ends up needing one particular primary label.

In Tim's example, Wikipedia picks the term that Tim (and I) happen to prefer. "Nat Turner's [Slave] Rebellion" is likely the mainstream consensus preferred term now, and LCSH should probably be updated to recognize that (which is quite possible to do; that most of our particular ILSs make it cumbersome to respond when LC does that is our software's fault, and MARC's fault, but not the controlled vocabulary's fault).

But Tim's main point isn't really about lag in catching up with mainstream consensus changes in terminology, right? It's just that different people will think of things differently. Is there someone else irked that Wikipedia's conception of the world is smaller than theirs because THEY want it to be the Southampton Insurrection, and damn Wikipedia for needing them to think like Wikipedia?

Perhaps. But they get over it and use the tool. 

---
http://en.wikipedia.org/wiki/Southampton_Insurrection
Nat Turner's slave rebellion
  (Redirected from Southampton Insurrection)
---

Wikipedia's redirections to canonical article titles are QUITE analogous to LCSH's headings and lead-in terms. Wikipedia is probably better than LCSH at using the current mainstream consensus heading instead of an outdated one. Of course, Wikipedia isn't all that old. It will be interesting to see Wikipedia in 40 years, and how they handle changes in consensus vocabulary for concepts whose articles have existed for 40 years under another title. But Wikipedia, if it's around in similar form in 40 years, probably will be pretty good at changing 'primary' titles and using redirections.

LCSH maintenance procedures definitely can and should be modified to be more nimble, taking more advantage of volunteer labor. 

But Tim's point wasn't _really_ about some LCSH headings being in need of updating, but that LCSH _inevitably_ needs to pick ONE title at any given time as a preferred heading, with the rest just being lead-ins. Interesting to see that Wikipedia is subject to that exact same thing too: one title needs to be picked as the actual canonical article title, with the rest being redirections (possibly from a disambiguation page).

Maybe there's some way to use the way Wikipedia presents these things as a model for a library catalog? Use the phrase "Redirected from" on screen somehow? Not sure though; it's not exactly analogous, since the things indexed in a library catalog are different in significant ways from the things indexed in Wikipedia.
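
To make the "Redirected from" idea a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the table contents, the `resolve` function, and the wording of the notice are made up, and a real catalog would draw its lead-in terms from authority records rather than a hard-coded dictionary.

```python
# Hypothetical toy data: one preferred heading plus the lead-in terms
# (Wikipedia would call them redirects) that point to it.
PREFERRED = {"Nat Turner's slave rebellion"}
LEAD_INS = {
    "southampton insurrection": "Nat Turner's slave rebellion",
    "nat turner's rebellion": "Nat Turner's slave rebellion",
}


def normalize(text):
    """Lower-case and collapse whitespace so lookups are forgiving."""
    return " ".join(text.lower().split())


def resolve(query):
    """Return (heading, notice).

    notice is None when the query already is the preferred heading,
    and both values are None when nothing matches.
    """
    key = normalize(query)
    for heading in PREFERRED:
        if normalize(heading) == key:
            return heading, None
    heading = LEAD_INS.get(key)
    if heading is None:
        return None, None
    # Nobody is told they "meant" something else; we only say how we got here.
    return heading, f"(Redirected from {query.strip()})"


if __name__ == "__main__":
    heading, notice = resolve("Southampton Insurrection")
    print(heading)
    print("  " + notice)
```

The notice echoes the user's own phrase back rather than correcting it, which is the part of the Wikipedia presentation that seems worth borrowing.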

[Out of curiosity, Tim, are you irked when Wikipedia insists on "redirecting [you] from" what you REALLY want to call the topic, because Wikipedia's world is smaller than yours and damn Wikipedia for making you use its mental model?]

________________________________________
From: Next generation catalogs for libraries [NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind [rochkind_at_JHU.EDU]
Sent: Monday, May 04, 2009 9:52 PM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] Leveraging Authority Data in Keyword Searches

________________________________________
From: Next generation catalogs for libraries [NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Tim Spalding [tim_at_LIBRARYTHING.COM]
Sent: Monday, May 04, 2009 10:59 AM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] Leveraging Authority Data in Keyword Searches

> Is this just about the phrase used?  Should it say "See also" instead of "did you mean"?  Or can anyone think of another phrase that could be used to label the (legitimately useful, I think?) functionality without being odd?

> I think these problems are inherent in controlled vocabulary, and to
> the extent that users are used to the idea of *telling* search engines
> what they want, *being told* that what they want is wrong will irk and
> confuse people.

But people really appreciate Google's "did you mean", or at least it doesn't irk them, right? Or are you suggesting that Google's feature really does irk people, but Google hasn't realized it yet and leaves it in anyway?

I know I appreciate it.

I know, Google's not using controlled vocabulary.  But what's that got to do with users being used to the idea of telling search engines what they want, not being offered suggestions?  I think users are quite used to Google's suggestions; is there reason to believe they don't appreciate them as much as I thought they did?

Controlled vocabulary is a REALLY useful tool in retrieval, but it's not really useful because of the _particular_ labels used for a term; some of those labels could profitably be changed. But the other option, instead of saying "did you mean" (or equivalent), is just _automatically_ expanding the query to controlled vocab terms that had "lead in" terms matching the query.

I tend to think that automatically expanding a user's query is _worse_ than giving the user the -option- to expand a query. Although in some sense Google does that too -- like when you get a page in your results that didn't seem to have your query in it _anywhere_, but probably was in your results because a link TO that page had your search terms in it. That's the closest analogy I can think of to automatically expanding a search query based on c.v. lead in terms. Except even in Google it often annoys me when it happens, and it would happen much more in a library c.v. environment.
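
The difference between the two options can be sketched in a few lines of Python. This is only a toy under stated assumptions: the three records, their titles, the lead-in table, and all the function names are invented for illustration and do not describe how any actual catalog or search engine works.

```python
# Hypothetical toy catalog: record id -> title and subject headings.
RECORDS = {
    1: {"title": "Nat Turner: a made-up biography",
        "subjects": ["Southampton Insurrection"]},
    2: {"title": "A made-up history of an 1831 uprising",
        "subjects": ["Southampton Insurrection"]},
    3: {"title": "Made-up essays on Virginia", "subjects": []},
}

# Hypothetical lead-in terms pointing at a controlled vocabulary heading.
LEAD_INS = {"nat turner": "Southampton Insurrection"}


def keyword_search(query):
    """Plain keyword match against titles only."""
    q = query.lower()
    return [rid for rid, rec in RECORDS.items() if q in rec["title"].lower()]


def heading_search(heading):
    """Everything indexed under a controlled vocabulary heading."""
    return [rid for rid, rec in RECORDS.items() if heading in rec["subjects"]]


def search_with_suggestion(query):
    """Option 1: run the query as typed and merely OFFER the heading."""
    return keyword_search(query), LEAD_INS.get(query.lower())


def search_with_expansion(query):
    """Option 2: silently fold the heading's records into the results."""
    hits = set(keyword_search(query))
    heading = LEAD_INS.get(query.lower())
    if heading is not None:
        hits.update(heading_search(heading))
    return sorted(hits)


if __name__ == "__main__":
    print(search_with_suggestion("Nat Turner"))
    print(search_with_expansion("Nat Turner"))
```

With expansion, record 2 shows up even though the user's words appear nowhere in it and nothing on screen explains why. With the suggestion, the user sees their own results and can choose to follow the heading.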

I think it is WAY premature to conclude that there's no way to powerfully use controlled vocabularies without "irking" users. There are SO many things that can be done with them that have not yet been significantly tried or tested.




"Take "Nat Turner" or his rebellion. Use the UFL catalog and you're
asked "Did you mean the Southampton Insurrection?" I'd say "Well, no,
I didn't *mean* that. I meant Nat Turner."

So, okay, we'll find a different way to present it. I saw old catalogs that used some variation of "Nat Turner is referred to as <a>'Southampton Rebellion'</a> in this catalog".  There could potentially also be a way to do it that offered to show the user the records indexed under that vocabulary -term- without actually even telling the user it has a different 'primary' label. But I need to think about that more.

No doubt that will still irk Tim, but them's the breaks. I suspect most users are not as easily irked by library weirdness as us library geeks; they just go with the flow. And LCSH _does_ change headings from time to time to match new consensus vocabulary. But with such a large vocabulary, it will never be as current as some would like.  Ideally an individual library (or group of libraries) would be able to choose their own 'preferred' heading for a term in their local systems without having to wait for LCSH (and ideally LCSH itself could be more nimble through the right kind of 'crowd sourcing' techniques).  But that doesn't get around the issue that different people or communities may simultaneously have their own preferred terms for a concept.

But the use of a controlled vocab like LCSH is really in the _existence_ of a 'term', and the things indexed to it. 'Term' isn't the best word, because it's not really about the English phrase used, which can be changed (or presented in languages other than English), but the particular element in the C.V., with an authority number, that individual works have attached to it.  Everyone is welcome to use whatever actual English phrase they like to refer to the same person, place, or thing -- and they are all served by that event having a place in a topical c.v. so the set of works on that topic can be retrieved.
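
A rough Python sketch of that distinction, with the identifier doing the work and the label being swappable. The authority id "auth-0001", the work ids, and the function names are all made up for illustration; this is not real LCSH data or any real system's API.

```python
# Hypothetical concept table keyed by a made-up authority id.
CONCEPTS = {
    "auth-0001": {
        "preferred": "Southampton Insurrection",
        "lead_ins": ["Nat Turner's Rebellion"],
    },
}

# Works attach to the authority id, never to the label text.
WORKS_BY_CONCEPT = {"auth-0001": ["work-1", "work-2"]}


def display_label(authority_id, local_preferred=None):
    """Label to show: a library's local choice if it has one, else the default."""
    local_preferred = local_preferred or {}
    return local_preferred.get(authority_id, CONCEPTS[authority_id]["preferred"])


def works_for(authority_id):
    """Retrieval depends only on the id, so relabeling cannot change it."""
    return list(WORKS_BY_CONCEPT.get(authority_id, []))


if __name__ == "__main__":
    local = {"auth-0001": "Nat Turner's Rebellion"}
    print(display_label("auth-0001"))          # default label
    print(display_label("auth-0001", local))   # one library's local choice
    print(works_for("auth-0001"))              # same works either way
```

The local override is just a second small table consulted before the shared one, which is about all it would take for a library to pick its own preferred label without waiting for the shared vocabulary to change.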

Perhaps there's some clever way to make use of a topical C.V. without ever having to pick just ONE "preferred" term to show to the user -- but I can't think of one offhand that wouldn't end up being awfully confusing.   But perhaps there's a way to have the interface make it clear that nobody is trying to get you to 'mean' anything; we are just trying to help you find stuff on topics you're interested in, which you are welcome to think of or label however you like.

Jonathan


"*You* mean it, and you want me to mean it too, because your mental model of the world is smaller than mine, and you need me to think like you."

No way man, we don't care how you think or what you mean -- there's a particular event in history, and you are welcome to think of it however or by whatever label you want. We just have things about that event indexed under a controlled vocabulary term, and would like to show them to you.  If you _really_ think the maintainers of LCSH care about trying to make everyone have the same political understanding of events... then I wonder how you got that idea.
Received on Mon May 04 2009 - 22:16:19 EDT