Re: The problem with OPACs [was: New subject keyword search]

From: Doran, Michael D <doran_at_nyob>
Date: Sat, 28 Jul 2007 15:13:08 -0500
To: NGC4LIB_at_listserv.nd.edu

> Your catalog seems to be entirely keyword searches of
> various kinds (basic and "advanced").

With the Voyager ILS we essentially have a choice of making the "Advanced Search" a guided boolean keyword search, or a combination of left-anchored/keyword search/browses.  Personally, I would prefer the combination of left-anchored search/browses, but the consensus in our library is currently leaning towards the Advanced Search that users see. 

> So when I search for "Civil war" in a basic search, I get 4701 hits,
> mostly on the American Civil War, but also Julius Caesar's Civil War,
> Karl Marx's Civil war in France, books on the Spanish Civil War,
> a Chinese civil war, and others. A lot of stuff to sort through.

Our basic search is all about expected search behavior.  And that's exactly the type of results most people would expect.  I.e. they don't expect the search engine to be a mind reader and know that what they were really looking for was (for example) the Spanish Civil War.  Typically what you, or I, or anybody else would do at that point after seeing all the different results, would be to redo a search on "spanish civil war" [1].  We do that *all* the time in Google/Yahoo/Amazon/whatever when we get too many hits that aren't relevant to what we are looking for.  

The important thing (see below) is that the user *got* results, and those results gave them the information they needed to refine their search.  Note: My *preference* would have been an OPAC search interface with facets (à la VUFind, Primo, et al.) that would have given the user a one-click way to narrow the search to Spanish Civil War, but that's not an option with Voyager.

> Let's say I use your advanced search and type in Spanish civil war,
> as a phrase, in subjects. I get nothing. 

If the intention was to make *my* point, then that has been accomplished.  The exact search (a phrase search for "Spanish civil war" [1]) that returned 120 relevance ranked hits in the default basic search interface *fails* in the advanced subject search, precisely because the user doesn't know the secret handshake of Library of Congress Subject Heading syntax.  

> If I had typed Civil war, "as a phrase," and Spain,
> "all of these words," both in subjects, I would get
> 201 items. But how many users would know you have to
> use Spain on a separate line instead of Spanish civil war?

Again, this is making my exact point!  I whole-heartedly agree that a user shouldn't *have* to know that.  The whole purpose of the UT Arlington default OPAC search is that they don't need to know that kind of stuff.
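For anyone who wants to see the "secret handshake" spelled out, here is a minimal Python sketch of why the phrase fails while the two-line search succeeds.  It assumes (this is my simplification, not Voyager's documented behavior) that a subject phrase search compares a normalized word sequence against the heading as cataloged.  The function names are hypothetical.

```python
import re

# The established LC heading quoted in this thread.
HEADING = "Spain--History--Civil War, 1936-1939"

def normalize(text):
    """Lowercase the text and reduce it to space-separated word tokens."""
    return " ".join(re.findall(r"\w+", text.lower()))

def phrase_in(phrase, heading):
    """True if the words of `phrase` appear consecutively in `heading`."""
    return f" {normalize(phrase)} " in f" {normalize(heading)} "

def all_words_in(words, heading):
    """True if every word of `words` appears somewhere in `heading`."""
    have = set(normalize(heading).split())
    return all(w in have for w in normalize(words).split())

# The user's natural phrase never occurs in the heading: it says "Spain",
# not "Spanish", and "History" sits between "Spain" and "Civil War".
print(phrase_in("Spanish civil war", HEADING))

# The two-line search works: "Civil war" as a phrase, plus "Spain" as a word.
print(phrase_in("Civil war", HEADING) and all_words_in("Spain", HEADING))
```

The first check prints False and the second prints True, which is the whole problem: the user has to already know the heading's wording to get a hit.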

> Compare that with a "topic or genre/form search" for Spanish
> civil war on the UCLA archive.

Per my original response, explain to me again how the user knows (sans bibliographic instruction) that he/she needs/wants to use a "topic or genre/form search" (out of the twelve search types) when searching for items on the Spanish civil war?

> There you get a display of headings that show subject
> relationships.

Librarians are *keenly* interested in subject relationships.  Most of our users are not -- they want 'stuff' (preferably online in full text).  Our task is to make the subject relationships work IN THE BACKGROUND (and transparently to the user) to return the results that he/she needs.  That's one of the great things about the faceting model -- do a keyword search, get *hits* (like most people expect) but have the subject facets available off to the side to use, or not, for refining the search.
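A rough Python sketch of the faceting idea, for illustration only.  It assumes each hit is a dict with a "subjects" list (a made-up record shape, not the data model of VUFind, Primo, or Voyager), and the function names are hypothetical.

```python
from collections import Counter

def subject_facets(hits):
    """Count how many hits carry each subject heading, most frequent first."""
    counts = Counter()
    for record in hits:
        for heading in record.get("subjects", []):
            counts[heading] += 1
    return counts.most_common()

def narrow(hits, heading):
    """Keep only the hits that carry the chosen heading (the one-click refine)."""
    return [r for r in hits if heading in r.get("subjects", [])]

# Hypothetical keyword hits for "civil war".
hits = [
    {"title": "Homage to Catalonia",
     "subjects": ["Spain--History--Civil War, 1936-1939"]},
    {"title": "The Spanish Civil War",
     "subjects": ["Spain--History--Civil War, 1936-1939"]},
    {"title": "Battle Cry of Freedom",
     "subjects": ["United States--History--Civil War, 1861-1865"]},
]

for heading, count in subject_facets(hits):
    print(count, heading)

spanish = narrow(hits, "Spain--History--Civil War, 1936-1939")
print([r["title"] for r in spanish])
```

The user still starts with a plain keyword search and gets hits; the controlled vocabulary does its work in the sidebar counts, and only if the user chooses to click.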

> Admittedly, the vocabulary takes some getting used to.

You think?  Does that have usability repercussions?

> If you  search directly for "Spanish Civil War," it will appear
> on a browse screen,

If the user was wanting headings, rather than hits, I suppose that could be considered a successful search.

> ... but you have to click on the "more info" button, and it'll
> take you to the established heading, Spain--History--Civil
> War, 1936-1939.

And what leads us to believe that users would even click on the "more info" button?  Even if they do, they get *more* headings, but still no hits.

> But clicking on that will provide a whole array of headings,
> some with subdivisions like aerial operations, artillery operations,
> campaigns (subdivided by place), casualties, evacuation of civilians,
> and many more.

And even *more* headings -- the average user is probably ready to slit their wrists at this point.

> You don't have to examine hundreds of titles to find the
> specific ones you need.

When did our user discover that he/she was really looking for (for example) artillery operations in the Spanish civil war?  If "spanish civil war artillery" is searched in our basic search, it will pull up a small number of relevant items (if we have any, natch) since a keyword anywhere search searches subject headings (and we can field-weight subject headings high if we choose).  AND it will pull up items that have artillery in the title, table of contents, notes, or other field but, for whatever reason, were not assigned that LC subject heading subcategory.  The basic keyword search with relevance ranked results not only seems a lot easier, it also fits the model that users are most familiar with.
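To illustrate the field-weighting point, here is a small Python sketch.  The weights, field names, and scoring formula are all assumptions of mine for the sake of the example; Voyager's actual relevance algorithm is not described in this thread.

```python
import re

# Hypothetical weights: subject headings count most, then title, then the rest.
FIELD_WEIGHTS = {"subject": 3.0, "title": 2.0, "contents": 1.0, "notes": 1.0}

def tokens(text):
    """Lowercased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))

def score(record, query):
    """Sum the weights of every field each query term occurs in.

    Terms are implicitly ANDed: if any term is missing from all
    fields, the record is not a hit and scores 0.0.
    """
    total = 0.0
    for term in tokens(query):
        term_score = sum(
            weight
            for field, weight in FIELD_WEIGHTS.items()
            if term in tokens(record.get(field, ""))
        )
        if term_score == 0:
            return 0.0
        total += term_score
    return total

def search(records, query):
    """Return the hits for the query, highest score first."""
    scored = [(score(r, query), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    return [r for s, r in sorted(hits, key=lambda pair: pair[0], reverse=True)]
```

A record with "artillery" only in its title still comes back for "spanish civil war artillery", alongside records where the cataloger assigned the artillery subdivision; raising the subject weight just pushes the latter up the list.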

> Which system is more "user friendly" when you
> factor in that?

I'll leave that for our list readers to decide for themselves.

-- Michael

[1] Without quotes, a UT Arlington Library OPAC basic search on "Spanish civil war" gets 228 hits; within quotes, "Spanish civil war" gets 120 hits. (see http://pulse.uta.edu/)

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 cell
# doran_at_uta.edu
# http://rocky.uta.edu/doran/

________________________________

From: Next generation catalogs for libraries on behalf of Ted P Gemberling
Sent: Fri 7/27/2007 6:19 PM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] The problem with OPACs [was: New subject keyword search]

Michael,
Thanks for the response.

I realize that the UCLA search page is not as "user friendly" as it
could be. Frankly, I don't know what some of the searches are. I
wondered for a while why there was no author search, and just now
realized that "credits search" retrieves people's names in direct order,
if they are in 5XX notes.
"Credit variants" only picks up names that are in authority records. So
it will pick up Woody Allen (even in direct order) but not Naomi Watts,
since there's no authority in their database for her. You can get hits
for her with the credits search.

Keep in mind, though, that this is a specialized database for people
studying films. So they will probably get bibliographic instruction for
using it. It's not UCLA's general catalog.

Also, it wouldn't be very hard to present the "Topic or genre/form
search" as "subject keyword search (browsable arrays)" alongside
"subject keyword search (titles)." I'm not an expert on labels, so I
don't know how good those are, but it does seem to me that this would be
a relatively easy thing to solve by labeling.

Your catalog seems to be entirely keyword searches of various kinds
(basic and "advanced"). So when I search for "Civil war" in a basic
search, I get 4701 hits, mostly on the American Civil War, but also
Julius Caesar's Civil War, Karl Marx's Civil war in France, books on the
Spanish Civil War, a Chinese civil war, and others. A lot of stuff to
sort through. If I do civil war in an advanced search, looking for words
in subjects, I get 3001 hits.

Let's say I use your advanced search and type in Spanish civil war, as a
phrase, in subjects. I get nothing. If I had typed Civil war, "as a
phrase," and Spain, "all of these words," both in subjects, I would get
201 items. But how many users would know you have to use Spain on a
separate line instead of Spanish civil war?

Compare that with a "topic or genre/form search" for Spanish civil war
on the UCLA archive. There you get a display of headings that show
subject relationships. Admittedly, the vocabulary takes some getting
used to. If you  search directly for "Spanish Civil War," it will appear
on a browse screen, but you have to click on the "more info" button, and
it'll take you to the established heading, Spain--History--Civil War,
1936-1939. But clicking on that will provide a whole array of headings,
some with subdivisions like aerial operations, artillery operations,
campaigns (subdivided by place), casualties, evacuation of civilians,
and many more. You don't have to examine hundreds of titles to find the
specific ones you need. Which system is more "user friendly" when you
factor in that?

I'm puzzled as to why you don't provide any alphabetical (heading)
searches on your catalog when Voyager provides that possibility. They
are not options at all, as near as I can see.

Ted Gemberling
UAB Lister Hill Library
(205)934-2461

-----Original Message-----
From: Next generation catalogs for libraries
[mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Doran, Michael D
Sent: Thursday, July 26, 2007 3:48 PM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] The problem with OPACs [was: New subject keyword
search]

> I'm sorry for my outburst about "libraries not being
> educational institutions." That is caricaturing what you
> said.

No offense taken.

> But have you tried the search interface Yee sent?
> http://cinema.library.ucla.edu

Yes, I have.

Now, I'm going to put my "usability" hat on and ask y'all to do the
same.  Compare the UCLA Film and Television Archive default OPAC search
interface (http://cinema.library.ucla.edu) to our University of Texas at
Arlington Library's default OPAC search interface
(http://pulse.uta.edu/).  They are both the same Voyager ILS, so can
stand comparison as to the search interface itself (if not the data).

Look at the search tab labels.  Compare "Recommended Searches" and
"Complex boolean keyword and cross index searching with index
specification" to "Search" and "Advanced Search."  Which do you think
most users would intuitively understand and know when to use one and
when to use the other?  Usability tells us that we don't want the user
to be expending all their time figuring out how to use the *tool*;
if designed right, they should be able to (figuratively) just pick up
the tool and start using it.

On the UCLA "Recommended Searches" search tab, a user has to decide what
type of search to do (among twelve possibilities) and the reason that
interface is loaded with search tips/help, is that for most of the
choices, the user has to know one of the secret handshakes, vs. the UTA
"Search" tab where no search type choice is necessary.

In the UTA "Search" tab search:
        words within quotation marks are treated as a phrase
        words outside of quotation marks are automatically boolean ANDed
        the search is a keyword everywhere search [1]
        the results are relevance ranked
        either an asterisk ("*") or a question mark ("?") can be used for truncation
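
The matching rules in that list can be sketched in a few lines of Python.  This is only an illustration of the behavior described, not Voyager code: the function names are hypothetical, truncation is assumed to apply only at the end of a word, and relevance ranking is left out.

```python
import re

def parse_query(query):
    """Split a query into quoted phrases and loose terms (implicitly ANDed)."""
    phrases = [p.lower() for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]*"', " ", query)
    terms = [t.lower() for t in rest.split()]
    return phrases, terms

def term_to_regex(term):
    """Whole-word match, unless the term ends in '*' or '?' (truncation)."""
    if term.endswith(("*", "?")):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*")
    return re.compile(r"\b" + re.escape(term) + r"\b")

def matches(record_text, query):
    """Keyword-everywhere match against the full text of a record."""
    text = record_text.lower()
    phrases, terms = parse_query(query)
    for phrase in phrases:
        # Quoted words must appear together, in order.
        if not re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return False
    # Every unquoted term must appear somewhere (boolean AND).
    return all(term_to_regex(t).search(text) for t in terms)

record = "Twain, Mark. The adventures of Huckleberry Finn"
print(matches(record, "mark twain"))    # unquoted: word order does not matter
print(matches(record, '"mark twain"'))  # quoted: must match as a phrase
print(matches(record, "huck* finn"))    # truncation plus implicit AND
```

Against that sample record, the unquoted search and the truncated search both match, while the quoted phrase does not, because the record stores the name as "Twain, Mark".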

Author names can be searched last name first, or first name first.
Titles can be searched with, or without, a beginning article.  ISBNs can
be searched.  Subject headings can be searched.  Same box, same search
button.  Pretty much any search will pull up relevant hits.  No secret
handshakes required.

If that behavior sounds Google-like, it's because that was the
intention.  <important>NOT because Google is the be-all and end-all of
search functionality, but because Google SETS THE USER EXPECTATIONS
ABOUT HOW A SEARCH INTERFACE SHOULD WORK.</important>  When a search
interface behaves like most users expect it to behave, you don't need to
put all the search tips that explain why they got a "No hits" response
when they searched on "Mark Twain" or "The Color Purple".

Does this approach mean we give up some search precision?  Yes it does.
Are most users already acclimatized to that with the other search
interfaces they use?  Yes they are.

I would argue that there are also 'discovery' benefits with this search
approach: A user searching on "hamlet shakespeare" will not only find
hits for the play, they will also find works *about* the play.

Are the more precise search types important?  <important>Yes they are --
I'm just saying, move them to the advanced search and for the default
search give users a tool that they can pick right up and start
using.</important>

> The "topic or genre/form" search option provides a very
> user-friendly way for people to get familiar with controlled
> vocabulary.

User-friendly?  How are they going to know to pick that search type out
of the twelve search types listed?  Realistically, how many people do
you think are actually going to choose that search?  If you snatched a
student off of the quad and asked "In the context of an online search of
a Film and Television Archive, what do you think a 'topic or genre/form'
search is?  What do you think is being searched and what type of results
would you expect?"  What kind of answer (if any) do you think you would
get?  Just choosing a search type is going to leave a lot of users
scratching their heads: "What's a credit variance search?"  "I kind of
know what a call number is, but what's an Inventory number and why and
when would I search that?"  "How about a 'pre-existing works' search?"
"What's a holdings search?"  "Or a SPAC search?"  What does xref mean to
the *average* user?

Most users would be saying to themselves, "Why can't I just enter the
search terms I want and hit the search button?"  The same way that they
do in many, if not most, of the search interfaces outside of
library-land.

I'm sorry, but the UCLA Film Archive OPAC is a user interface designed
for librarians, not for end users.  Now that may be what UCLA wants and
needs, but I'm not buying that it is user-friendly.  I would put all
that stuff in an advanced search.

Please re-read the <important></important> parts before flaming!  :-)

-- Michael

[1] In the Voyager ILS, we have some control over relevancy ranking with
field weighting

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# doran_at_uta.edu
# http://rocky.uta.edu/doran/

> -----Original Message-----
> From: Next generation catalogs for libraries
> [mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Ted P Gemberling
> Sent: Thursday, July 26, 2007 12:22 PM
> To: NGC4LIB_at_LISTSERV.ND.EDU
> Subject: Re: [NGC4LIB] The problem with OPACs [was: New
> subject keyword search]
>
> Michael,
> I'm sorry for my outburst about "libraries not being
> educational institutions." That is caricaturing what you
> said. But have you tried the search interface Yee sent?
> http://cinema.library.ucla.edu
>
> The "topic or genre/form" search option provides a very
> user-friendly way for people to get familiar with controlled
> vocabulary. As I pointed out to someone on Autocat, if you
> search for world war battles or battles world war, you get
> these terms:
>
> World War, 1914-1918--Battles, sieges, etc.
> World War, 1939-1945--Battles, sieges, etc.
>
> Those are all see references (terms on authority records that
> are not the established forms). Clicking on the "more info"
> buttons by them, these established headings came up:
> World War, 1914-1918--Aerial operations.
> World War, 1914-1918--Campaigns.
> World War, 1939-1945--Aerial operations.
> World War, 1939-1945--Campaigns.
> World War, 1939-1945--Naval operations.
>
> When you click on the first of those, it is further expanded to:
> World War, 1914-1918--Aerial operations.
> World War, 1914-1918--Aerial operations, American--Drama.
> World War, 1914-1918--Aerial operations, British--Drama.
> World War, 1914-1918--Aerial operations--Caricatures and cartoons.
> World War, 1914-1918--Aerial operations--Drama.
> World War, 1914-1918--Aerial operations, German--Drama.
>
> They take you directly to the titles.
>
> It seems the problem with the position you are advocating is
> that you're missing the distinction between controlled and
> uncontrolled vocabulary.
> If I do a keyword search for world war battles from that same
> search page, I get 18 hits. 18 titles. But there is no
> indication of how the various hits relate to each other as
> subjects. A person has to laboriously go through each one,
> and probably a significant number will not be what she's looking for.
>
> What you said about standardization of design is well taken.
> I don't own a car but rent them occasionally, and it is
> annoying when I can't figure out where the lever for opening
> the gas cap and such like are. But the acquisition of
> knowledge is, I think, a more complex process than driving a
> car. Driving a car is expected to be habitual: once you learn
> the task, it's supposed to be something that requires little
> or no thought. Research isn't that kind of thing. The
> question is, do we want to transfer the effort and expense of
> organizing information entirely to the users, or do we want
> to continue to do some of it for them?
>
> I imagine Selden is right about the transaction logs in her
> system. But more research needs to be done on the research
> behavior and needs of scholarly users. Those are the people
> we want our freshmen to grow into, and if we remove tools
> they need, it may seriously impair our ability as a society
> to produce high-quality scholarship.
>
> Ted Gemberling
> UAB Lister Hill Library
> (205)934-2461
>
>
>
> -----Original Message-----
> From: Next generation catalogs for libraries
> [mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Doran, Michael D
> Sent: Thursday, July 26, 2007 9:32 AM
> To: NGC4LIB_at_LISTSERV.ND.EDU
> Subject: [NGC4LIB] The problem with OPACs [was: New subject
> keyword search]
>
> >  Selden Deemer wrote:
> >
> > Whatever the considerable benefits of browse displays (I read, and
> > took to heart Thomas Mann's comments), the fact remains that, when I
> > look at our search log stats, users (as opposed to librarians) simply
> > do NOT browse (and it's not for lack of instruction).
>
> I'm convinced that the underlying "problem" with our OPACs
> (from a usability perspective) is that they are sold once to
> librarians, rather than many times to end users.  If each
> user was making an individual purchase decision, OPACs would
> have quickly evolved to meet their needs.
> I believe ILS vendors (who we often unfairly blame) are quite
> capable of producing an awesome OPAC.  But the vendors are
> building OPACs to meet our (i.e. librarians') perceived needs,
> because vendors are smart and are in business to make money
> and they understand that *we* are the ones writing that big
> check every 10-15 years or so.  As Selden points out, OPAC
> features that are important/essential to us, are often ones
> that our users could care less about, despite all our
> well-meaning instruction.
>
> And that is assuming that OPAC functionality/usability is
> even a prime consideration in the purchase decision of an
> ILS.  Very often that's not the case, as acquisitions,
> cataloging, or circulation module features drive the decision
> and the OPAC is an afterthought.  If we want to find out
> who's responsible for sucky OPACs, the first place we need to
> look is in the mirror [1].
>
> On the bright side, products like VUFind, Primo, AquaBrowser,
> and Endeca unbundle the OPAC from the ILS, giving us a chance
> to atone for past ILS purchase decisions (which can't easily
> be undone).  One of the problems inherent in an ILS-bundled
> OPAC is that the 10-15 year (give or take) ILS replacement
> cycle does not allow for significant changes to what quickly
> becomes a calcified code base.  I'm particularly excited
> about Andrew Nagy's recently released open-source OPAC; with
> VUFind, the library-land development community has a golden
> opportunity to craft an OPAC that genuinely meets our users'
> needs.  However, doing so will require that we resist the
> temptation to create the ideal OPAC for *librarians*, and
> instead focus on creating an OPAC that meets our
> *users'* search needs.  I think that would be an OPAC that
> doesn't require instruction (however well-meaning) or require
> an initial search page that is 80% search tips.
>
> Just my opinion...
>
> -- Michael
>
> [1] Karen Schneider asks: "But the interesting questions are:
> Why don't online catalog vendors offer true search in the
> first place? and Why [don't we] demand it? Save the time of
> the reader!"  I would answer that vendors don't offer it, and
> we don't demand it, because the ILS (OPAC) check-writers have
> other priorities.
> See: Karen Schneider, How OPACs Suck, Part 1
> http://www.techsource.ala.org/blog/2006/03/how-opacs-suck-part-1-relevance-rank-or-the-lack-of-it.html
>
> # Michael Doran, Systems Librarian
> # University of Texas at Arlington
> # 817-272-5326 office
> # 817-688-1926 mobile
> # doran_at_uta.edu
> # http://rocky.uta.edu/doran/
>