Re: Leveraging Authority Data in Keyword Searches

From: Jacobs, Jane W <Jane.W.Jacobs_at_nyob>
Date: Tue, 5 May 2009 12:54:24 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU
I've been following all the responses to my original post with great
interest.

 

I find the most interesting thing is that there has been so little
experimentation with a concept which, to me, seems so obvious that I
wondered why I hadn't worked it out sooner myself. (But then I've been a
cataloger for so many years that it has undoubtedly re-wired my brain.)

 

The only two actual implementations I saw were Evergreen and Endeca.
Did I miss anything/one?

 

Mike Rylander wrote:

*       >That's what we've done from the beginning with Evergreen as
well.  For example:

 

*       http://gapines.org/opac/en-US/skin/default/xml/rresult.xml?rt=subject&tp=subject&t=JFK

 

*       See the section at the bottom of the result list labeled "You
may also like to try these related searches".  You'll notice that
there's also a listing for John F. Kennedy International Airport.
That's because we do a left-anchored authority comparison instead of an
exact match.

 

*       We believe that follows the principle of least surprise for the
user by not magically changing their input, but notifying them that
there's probably more related material available.
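
For readers who want to see what a left-anchored comparison might look like, here is a minimal Python sketch. It is not Evergreen's code: the authority table, the `normalize` helper and the `related_searches` function are hypothetical names made up for illustration, and the "JFK International Airport" variant is assumed rather than taken from a real authority record. The only point is that a query matches any heading or cross-reference that begins with it, and the matching authorized headings are offered alongside the user's untouched results.

```python
from typing import Dict, List

# Hypothetical authority data: authorized heading -> variant ("see from") forms.
# The airport variant is an assumption made for this example.
AUTHORITY: Dict[str, List[str]] = {
    "Kennedy, John F. (John Fitzgerald), 1917-1963": ["JFK"],
    "John F. Kennedy International Airport": ["JFK International Airport"],
    "Cockatiel": ["Cockateel"],
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore case and spacing."""
    return " ".join(text.lower().split())


def related_searches(query: str, authority: Dict[str, List[str]] = AUTHORITY) -> List[str]:
    """Return authorized headings with any form that starts with the query."""
    q = normalize(query)
    if not q:
        return []
    related = []
    for heading, variants in authority.items():
        forms = [heading, *variants]
        # Left-anchored: the form only has to begin with the query,
        # it does not have to equal it.
        if any(normalize(form).startswith(q) for form in forms):
            related.append(heading)
    return related


print(related_searches("JFK"))
```

With this toy table, a search on "JFK" brings back both the person and the airport, which is the behavior described above. An exact-match comparison would bring back only the person.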

 

I liked the disambiguation aspect here (reminds me a little of
Wikipedia) and the theory seems quite sensible.  However, I don't know
why I'd want to search BOTH JFK Assassination and Assassination of JFK
and, in fact, I probably wanted:

Kennedy, John F. (John Fitzgerald), 1917-1963

which isn't one of my choices.

 

And Elizabeth Yager Simpson wrote:

*       The State University Libraries of Florida are using an Endeca
catalog - http://uf.catalog.fcla.edu/uf.jsp .  Recently, we implemented
a "Did you mean ... " link for title, author and subject keyword
searches.  The system searches the user's term in the authority tables
and if there's a cross-ref match, it displays a "Did you mean ... " for
the authorized term.  Instead of populating the user's search results
with hits based on the authorized term, the system allows the user to
click and do a new search if desired.  
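
The behavior described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Endeca or FCLA implementation: the cross-reference table, the keyword index, the record identifier and the names `SearchResponse` and `keyword_search` are all hypothetical. The sketch shows the one design point that matters: the hits always come from the term the user typed, and the authorized term is returned separately as a suggestion to click.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical cross-reference table: variant term -> authorized heading(s).
CROSS_REFS: Dict[str, List[str]] = {
    "jfk": ["Kennedy, John F. (John Fitzgerald), 1917-1963"],
    "nat turner": ["Southampton Insurrection"],
}

# Hypothetical keyword index: search term -> record identifiers.
KEYWORD_INDEX: Dict[str, List[str]] = {
    "southampton insurrection": ["bib-001"],
}


@dataclass
class SearchResponse:
    hits: List[str]                                         # results for the user's own term
    did_you_mean: List[str] = field(default_factory=list)   # authorized terms to offer


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace before looking a term up."""
    return " ".join(term.lower().split())


def keyword_search(
    term: str,
    index: Dict[str, List[str]] = KEYWORD_INDEX,
    cross_refs: Dict[str, List[str]] = CROSS_REFS,
) -> SearchResponse:
    key = normalize(term)
    # The hits are never rewritten to use the authorized heading.
    hits = list(index.get(key, []))
    # A cross-reference match only produces suggestions for a new search.
    suggestions = list(cross_refs.get(key, []))
    return SearchResponse(hits=hits, did_you_mean=suggestions)


response = keyword_search("JFK")
print(response.hits, response.did_you_mean)
```

Because this lookup is an exact match on the cross-reference, "JFK" would suggest the person and nothing else, unlike a left-anchored comparison.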

 

State University Libraries of Florida offers me:

Did you mean John F. Kennedy International Airport?


but inexplicably fails to give me JFK himself.  I say inexplicably
because JFK IS a cross-reference to Kennedy, John F. (John Fitzgerald),
1917-1963.  Of course I don't know the exact algorithm they are using,
but I'd like to!

 

It seems clear that using authorities to expand searches is confusing to
the customer if it is not explained at some level. Users probably want
or expect more precision from the library catalog than they get from
Google.  It's interesting to note that Google has chosen to use the "Did
you mean?" explanation instead of a straight redirect.  I'm guessing
they based that on some reasonable study of customer behaviors which
would endorse my premise.  

 

I am not wild about the wording "Did you mean?" in the context of a
controlled vocabulary.  Chances are the customer meant exactly what they
typed whether we, librarians, or LC chose a different form or not.  The
possibilities seem to me to fall into two categories: variant spellings
and variant or related terms.

 

 

Hence I'm imagining something like:

Subject Keyword Search: Cockateel

Would you like to search Related Terms and/or Variant Spellings?

Cockatiel

 

I'd start with subject headings.  This is where a controlled vocabulary
will really botch up a keyword search if the user initially chooses the
wrong term. Right now a keyword search as above takes you exactly
nowhere in our catalog, and yet we have more than twenty titles about
Cockatiels. 

 

In the State University Libraries of Florida's catalog (Endeca with
"Did you mean?" search implemented - http://uf.catalog.fcla.edu/uf.jsp)
we get:

Did you mean cocktail
<http://uf.catalog.fcla.edu/uf.jsp?Ntt=cocktail&N=20&S=2531241539790343&Ntk=Subject&in_dym=1&Ntpr=0&Nty=1&Ntpc=1> ?

 

Did you mean Cockatiel
<http://uf.catalog.fcla.edu/uf.jsp?Ntt=+Cockatiel&N=20&S=2531241539790343&Ntk=Subject&in_dym=1&Ntpr=0&Nty=1&Ntpc=1> ?

 

Well, I'm not sure where they got "Cocktail" and no, that's not what I
meant at all.  Still, I must say that it's a whole lot better than:


No Results Found.


No records were returned.

 

It also works for "Cockatiels" (which is what I actually would have
searched, if I had been a real customer who did subject keyword
searches, instead of a cataloger with a re-wired brain who only does
browse searches).  "Cockatiels" is NOT one of the LC cross-references,
so clearly Endeca added some artificial intelligence to the "Did you
mean" which provides both a dead-on result (cockatiels -> cockatiel) and
a slightly wonky one (cockatiel -> cocktails).
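
I don't know what Endeca actually does, but approximate string matching alone is enough to produce this mix of dead-on and wonky suggestions. The Python sketch below is an assumption-laden illustration, not their algorithm: it uses the standard library's `difflib.get_close_matches`, and the heading list and the `variant_spellings` function are hypothetical.

```python
import difflib
from typing import List, Sequence

# Hypothetical list of authorized subject headings.
HEADINGS = ["Cockatiel", "Cockatoos", "Cocktails", "Parrots"]


def variant_spellings(
    term: str,
    headings: Sequence[str] = HEADINGS,
    cutoff: float = 0.6,
    limit: int = 3,
) -> List[str]:
    """Suggest headings whose spelling is close to the user's term.

    `cutoff` is the minimum similarity ratio (0 to 1) a heading needs
    in order to be suggested; the best match comes first.
    """
    # Compare in lowercase, but return the heading in its original form.
    by_key = {heading.lower(): heading for heading in headings}
    close = difflib.get_close_matches(term.lower(), list(by_key), n=limit, cutoff=cutoff)
    return [by_key[key] for key in close]


print(variant_spellings("Cockateel"))
print(variant_spellings("Cockatiels"))
```

In this sketch "Cockatiel" comes back first for both "Cockateel" and "Cockatiels". With a permissive cutoff, looser matches such as "Cocktails" may appear further down the list; a stricter cutoff trades those away at the risk of missing a badly misspelled term.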

 

Overall, though, I like it a lot.  I want to bang on it some more to see
whether and, if so, why it might give wonky results, but it seems like a
quantum leap forward.

 

JJ  

 

**Views expressed by the author do not necessarily represent those of
the Queens Library.**

 

Jane Jacobs
Asst. Coord., Catalog Division
Queens Borough Public Library
89-11 Merrick Blvd.
Jamaica, NY 11432
tel.: (718) 990-0804
e-mail: Jane.W.Jacobs_at_queenslibrary.org
FAX. (718) 990-8566

-----Original Message-----
From: Next generation catalogs for libraries
[mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Tim Spalding
Sent: Monday, May 04, 2009 10:59 AM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: Leveraging Authority Data in Keyword Searches

> Is this just about the phrase used?  Should it say "See also" instead
> of "did you mean"?  Or can anyone think of another phrase that could be
> used to label the (legitimately useful, I think?) functionality without
> being odd?

I think these problems are inherent in controlled vocabulary, and to
the extent that users are used to the idea of *telling* search engines
what they want, *being told* that what they want is wrong will irk and
confuse people.

Take "Nat Turner" or his rebellion. Use the UFL catalog and you're
asked "Did you mean the Southampton Insurrection?" I'd say "Well, no,
I didn't *mean* that. I meant Nat Turner. *You* mean it, and you want
me to mean it too, because your mental model of the world is smaller
than mine, and you need me to think like you."

I'm not sure how to solve it. I might take the top X subjects and put
them at the top as topics that may be of interest, or something. Or
you could take true equivalencies and silently redirect after
click ("Redirected from Partial Birth Abortion"), like Google does.

Incidentally, I think it was a righteous piece of programming, and
could prove useful. I just think it exposes the problems of the
underlying data.

Tim

 

On Mon, May 4, 2009 at 8:53 AM, Jonathan Rochkind <rochkind_at_jhu.edu> wrote:
> ________________________________________
> From: Next generation catalogs for libraries [NGC4LIB_at_LISTSERV.ND.EDU]
> On Behalf Of Tim Spalding [tim_at_LIBRARYTHING.COM]
> Sent: Monday, May 04, 2009 2:27 AM
> To: NGC4LIB_at_LISTSERV.ND.EDU
> Subject: Re: [NGC4LIB] Leveraging Authority Data in Keyword Searches
>
> May I suggest there is something Orwellian about the phrase "did you
> mean?" when applied to politically-charged terms? It suggests to me
> not merely that resources can be found under a given heading, but that
> the searcher's own term is invalid and wrong.
>
> Many may feel comfortable delegitimizing "partial birth abortion" and
> "socialized medicine" but, to take a few examples from Sanford
> Berman, UFL still has two items with the LCSH "Yellow peril," 137 under
> "Jewish Question" and 2 under "Catholics as Scientists."
>
> So far, I haven't been able to make the catalog suggest I'm after
> information on the Yellow Peril, though, and indeed when I search for
> the phrase, it suggests I mean "Walleye (Fish)." Are you mining your
> own subject headings or doing it against a list of currently-approved
> ones?
>
> Tim
>
> On Fri, May 1, 2009 at 5:52 PM, Simpson,Elizabeth Yager
> <betsys_at_uflib.ufl.edu> wrote:
>> The State University Libraries of Florida are using an Endeca catalog
>> - http://uf.catalog.fcla.edu/uf.jsp .  Recently, we implemented a "Did
>> you mean ... " link for title, author and subject keyword searches.  The
>> system searches the user's term in the authority tables and if there's a
>> cross-ref match, it displays a "Did you mean ... " for the authorized
>> term.  Instead of populating the user's search results with hits based
>> on the authorized term, the system allows the user to click and do a new
>> search if desired.
>

-- 

Check out my library at http://www.librarything.com/profile/timspalding
Received on Tue May 05 2009 - 12:56:23 EDT