Re: Yes but

From: Ted P Gemberling <tgemberl_at_nyob>
Date: Wed, 9 May 2007 12:23:50 -0500
To: NGC4LIB_at_listserv.nd.edu
Casey, Kristin, and everybody,
I was impressed with the sort of interface Casey showed. I couldn't tell
for sure whether it was an Endeca implementation, but it looked like
one. It does seem that a major way we can link keyword searching with
controlled vocabulary is the way Endeca does it, by leading people to
the vocabulary via particular titles. I think that's the way people have
always done research: by finding one or two things they wanted and then
looking for others similar to them in various ways. Or at least that's
the way I always did it.

I do think, though, that this is a somewhat different use of the term
"faceted" than what I believe is traditional. I think traditionally,
"faceted subjects" meant subjects that have been "decomposed" into
separate logical elements that are extremely general, so that it's easy
to move from one combination of the elements to another. I suppose maybe
the "decomposition" is present on Lamson's Endeca-type page with the
distinction between subjkey, author, format, and Meta. But the subjects
themselves in subjkey are not faceted in that traditional sense.

That leads to another point in response to Jonathan. If we're going to
pursue making LCSH more "faceted," I think we should consider the way
the National Library of Medicine does it rather than FAST. Lois Mai Chan
herself has said FAST has a serious problem because when you decouple
subfield x's from their main headings, they often become kind of
meaningless. As someone pointed out at ALA last year, if a book has
these headings in LCSH:
600 10 Shelley, Percy Bysshe, 1792-1822--Philosophy
600 00 Plato--Influence

(obviously a book on Plato's influence on Shelley) in FAST it will be:
600 10 Shelley, Percy Bysshe, 1792-1822
600 00 Plato
650    Philosophy
650    Influence

"Influence" is pretty meaningless, I think, and Philosophy is too
general. This is not a general book on philosophy. NLM's Medical Subject
Headings, as implemented in their own catalog (http://locatorplus.gov/)
keep the subfield x's with the main headings. They only break out the
geographic and form subdivisions into separate 65X's. This is not to say
LCSH should entirely copy MeSH, since MeSH doesn't have subdivisions
like philosophy or influence, but the general plan of MeSH is better
than FAST.
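
To make the contrast concrete, here is a minimal sketch in Python of the two ways of breaking up a heading. Everything in it is an assumption for illustration: the list-of-pairs representation, the function names, and the treatment of the whole name (dates included) as a single "a" element are hypothetical, and neither function is FAST's or NLM's actual conversion routine. The subfield letters follow MARC usage (x = topical, z = geographic, v = form subdivision).

```python
# Hypothetical representation: a heading is a list of (subfield_code, text)
# pairs. "a" is the main heading; "x", "z", "v" are topical, geographic and
# form subdivisions. This is a simplification, not a real MARC parser.

def split_fast_style(heading):
    """Break every element of the heading out into its own heading."""
    return [[(code, text)] for code, text in heading]

def split_nlm_style(heading):
    """Keep topical ($x) subdivisions with the main heading;
    break out only geographic ($z) and form ($v) subdivisions."""
    kept = [(c, t) for c, t in heading if c not in ("z", "v")]
    broken_out = [[(c, t)] for c, t in heading if c in ("z", "v")]
    return [kept] + broken_out

def display(heading):
    """Render a heading with the usual double-dash separator."""
    return "--".join(text for _, text in heading)

shelley = [("a", "Shelley, Percy Bysshe, 1792-1822"), ("x", "Philosophy")]

print([display(h) for h in split_fast_style(shelley)])
print([display(h) for h in split_nlm_style(shelley)])
```

Under these assumptions the first call yields "Philosophy" as a free-standing heading, while the second keeps "Shelley, Percy Bysshe, 1792-1822--Philosophy" together, which is the point of the comparison.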

I appreciate the work Kristin and others have done with Endeca, but one
area where I might disagree a bit is on the value of subject browse
screens. As Thomas Mann has pointed out, browse screens do a lot to help
a researcher figure out the range of related subjects. The subject
results go beyond what the searcher might have known on his own and are
informative in themselves. That is a sort of "faceting" in a sense, in
that the screens allow a searcher to see a basic term conjoined with
various others. He tells of showing a researcher how doing a subject
search for Yugoslavia rather than a keyword search allowed her to see a
great variety of headings such as "... Antiquities" (including
"Antiquities, Roman" and others), "... Armed forces--History," and many,
many others. (I am grateful that Endeca didn't eliminate these browse
screens while emphasizing keyword searches.)

One last comment to Kristin. She lamented that a subject search on her
library's catalog for "Educational sociology" only gives 618 hits, while
a keyword search for "sociology of education" gives 848. There is
perhaps also the added discrepancy that the Endeca "facet" list only
gives 388 hits when you do the keyword search for sociology of
education. But to tell you the truth, I consider that a rather small
problem. I think this kind of goes to the heart of the "seamless" issue.
What's wrong with a patron having to go to a library staff member or
reference librarian and ask how to find more resources? Why should we go
to such lengths to do all the work for patrons? Another thing to keep in
mind is that in attempting to do all that work, we may also be working
ourselves out of our jobs!
     --Ted Gemberling

-----Original Message-----
From: Next generation catalogs for libraries
[mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Kristin Antelman
Sent: Tuesday, May 08, 2007 4:30 PM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] Yes but

Casey,

You're right that exposing subject vocabulary in facets can lead users
to the correct vocabulary, but in the current faceted catalogs it's hit
or miss on recall using that link to the correct term, because your
initial retrieval set was created from a keyword search.  At NCSU, we
call this the "revolutionary war" problem (user searches "revolutionary
war" while correct heading is "United States--History--Revolution,
1775-1783").

Your example, sociology of education, demonstrates the problem.  In our
catalog, a keyword search on that term gives 861 hits, and the facet
linking to the correct term, educational sociology, has 388 hits.  If
you look in our LCSH browse index, however, you will find 618 items with
the heading "educational sociology" and its associated 81 subheadings.
So the keyword searcher is not seeing 230 items with the exact heading
they were searching for.

What we need is a link from the user's keyword search to a
keyword-in-subject phrase search on the correct heading.  You can
simulate that in our catalog by searching "educational sociology" as a
keyword-in-subject search, where you will get all 618 items, with
facets available for further refinement.  If we could do this, the user
experience (apart from the issue of not knowing the vocabulary to enter
the LCSH Browse world) would be much better than having to page through
5 screens of 82 subheadings of "educational sociology" in the Browse
index.  We are looking at several approaches to using the cross
references in the authority files to accomplish this.
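
One way such a link could work is sketched below in Python. This is only an illustration under stated assumptions, not a description of any approach NCSU is actually considering: the `SEE_FROM` table is a toy stand-in for authority-file cross references, its two entries are invented for the example rather than copied from real authority records, and the function names are hypothetical.

```python
def normalize(term):
    """Lower-case and collapse whitespace so variants compare equal."""
    return " ".join(term.lower().split())

# Toy stand-in for "see from" references in an authority file.
# The entries are invented for illustration only.
SEE_FROM = {
    "sociology of education": "Educational sociology",
    "revolutionary war": "United States--History--Revolution, 1775-1783",
}

def suggest_heading(keyword_query):
    """Return the authorized heading whose variant matches the query, or None."""
    return SEE_FROM.get(normalize(keyword_query))

def build_search(keyword_query):
    """Route the query to a keyword-in-subject search when a heading is known,
    and fall back to a plain keyword search otherwise."""
    heading = suggest_heading(keyword_query)
    if heading is None:
        return ("keyword", keyword_query)
    return ("keyword-in-subject", heading)

print(build_search("Sociology of  Education"))
print(build_search("bagged products"))
```

An exact-match lookup like this would miss longer queries such as "causes of the revolutionary war"; handling those would need phrase matching inside the query, which the sketch leaves out.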

Of course, this is a problem not with faceted navigation but with
keyword searching.  The faceted navigation interface does, I think, lead
to a false sense that the catalog is making the connection between
keyword and controlled searching, when in many cases it's very much a
partial connection.  (E.g., the lost recall for a user in our catalog
searching "causes of the revolutionary war" is over 99%: 3 hits starting
w/kw search and navigating to the correct heading vs. 388 books w/the
correct heading.)
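
The arithmetic behind both examples can be checked with a short Python sketch. The function name `lost_recall` is just a label chosen here; the counts are the ones quoted in this message.

```python
def lost_recall(found, relevant):
    """Fraction of items carrying the correct heading that the user never sees."""
    return 1 - found / relevant

# "Educational sociology": 388 items via the facet vs. 618 under the heading.
print(618 - 388)                        # items missed: 230
print(f"{lost_recall(388, 618):.1%}")   # 37.2%

# "Causes of the revolutionary war": 3 hits vs. 388 under the heading.
print(f"{lost_recall(3, 388):.1%}")     # 99.2%
```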

-Kristin

Casey Bisson wrote:
> Ted,
>
> This is an excellent example.
>
> I often ask people if they know what "bagged products" are, and the
> usual answer is "huh?" Then I offer this picture (link below) and
> watch as people immediately understand the term.
>
> http://maisonbisson.com/blog/post/11538/
>
> I'm an advocate for the kind of controlled vocabularies you describe here,
> but I've also seen how we can represent them in our systems in ways
> that help the user make better sense of them.
>
> Example: I often see "sociology of education" appear in our search
> stats, while the correct LCSH is "educational sociology." Clearly
> there's a huge number of users at my library that don't know the LCSH,
> but they still need good results. My solution (and it's old hat by
> now) was to display the aggregate subjects as a facet.
>
> http://plymouth.edu/library/opac/search/sociology+of+education
>
> And using your examples, the subject facets again reveal some very
> useful information:
>
> http://plymouth.edu/library/opac/search/eskimo
> http://plymouth.edu/library/opac/search/inuit
>
> The challenge I'm trying to meet is to provide sophisticated results
> without increased complexity. The subject facets reveal what the
> catalog knows (based on what librarians have acquired and the metadata
> they have) about the keywords the user searched. We know from previous
> studies that users modify their searches based on the results
> returned, and I've seen lightbulbs appear in users as they explore the
> facets.
>
> The result is that a user who didn't know the LCSH before starting a
> search learns it quickly.
>
> That is, sophisticated tools can make complex research easy.
>
> Now one of the things I'd like to see is tooltips for the LCSH facets
> that offer a deeper explanation of what they are (and are not).
>
> Notes:
> 1: the code serving the above links is over a year old and is
> embarrassing, but it's got the largest collection of relevant items.
> For a more interesting and up to date example of Scriblio (was WPopac)
> see http://beyondbrownpaper.plymouth.edu/browse/ .
> 2: my library's collection doesn't come close to serving the needs of
> somebody researching "Judaism and the difference between its concepts
> of Messiahship and those of Christianity," the first example in your
> original message.
>
> --Casey
>
>
> On May 8, 2007, at 3:15 PM, Ted P Gemberling wrote:
>
>> Here's another example that shows the important role of librarians as
>> information "experts." A lot of people today are under the impression
>> that "Inuit" and "Eskimo" are equivalent terms. Generally Inuit is
>> considered more appropriate to use. NLM's Medical Subject Headings
>> accept that equivalence and establish Inuit as the term. But if you look
>> at the LCSH hierarchy, you find that Eskimo is actually a broader term
>> than Inuit. Here's the scope note for Inuit:
>>
>> "Here are entered works limited to the indigenous Arctic peoples of
>> Greenland, Canada, and northern Alaska. Works discussing collectively
>> the Inuit peoples and the related Eskimo peoples of southern and western
>> Alaska and adjacent regions of Siberia, or works for which the
>> individual group cannot be identified, are entered under ǂa Eskimos."
>>
>> Probably 70-80% of all Eskimos in the world are Inuits, but having spent
>> one summer in Western Alaska, I'm aware there is another 20-30% who are
>> Yupiks. The only term we have for both groups is Eskimos. This shows the
>> close collaboration LCSH subject specialists have with people with
>> knowledge of subject areas. Just looking at the LCSH syndetic structure
>> is informative for a researcher. Keywords cannot provide that
>> information without a lot more work on her part.
>
>
> Casey Bisson
> __________________________________________
>
> Information Architect
> Plymouth State University
> Plymouth, New Hampshire
> http://oz.plymouth.edu/~cbisson/
> ph: 603-535-2256

--
________________________________________
Kristin Antelman
Associate Director for the Digital Library
NCSU Libraries
Box 7111
Raleigh, NC 27696-7111
(919) 515-7188 Fax (919) 515-3628
Received on Wed May 09 2007 - 11:23:33 EDT