Re: Aqua Browser in beta at U. Chicago

From: Stephens, Owen <o.stephens_at_nyob>
Date: Fri, 21 Dec 2007 10:44:23 -0000
To: NGC4LIB_at_listserv.nd.edu

I was using the terminology as presented by AquaBrowser. Although I
understand the point, I think it would be fair to say that 'relevance
ranking' has become a generally recognised term, and so there is
justification for using it in library search engines. Google doesn't go
round explaining that when it says 'relevance' it means that, based on
the information you have given it and the information it has from other
sources, the results are sorted in an order that increases the
likelihood that the results you want appear closer to the top of the
list. I suspect that the audience is not going to be very receptive to
this type of explanation - but I don't really want to get back into the
'what is good enough' debate here.

With AquaBrowser my issue is that what it calls 'relevance' sort order
in the facets seems to be based on the number of times the heading
appears in the result set rather than on any other criterion (I'd be
glad to be corrected if I've misread this).
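
To make concrete what I think is happening, here is a minimal sketch of
a purely count-based facet ordering. It is my guess at the behaviour, not
AquaBrowser's actual code; the function name facet_by_count, the record
layout and the sample records are all invented for illustration.

```python
from collections import Counter

def facet_by_count(result_set, field):
    """Order the headings of one facet by how often they occur in the result set.

    Assumption: each record is a dict mapping a field name to a list of headings.
    """
    counts = Counter(
        heading
        for record in result_set
        for heading in record.get(field, [])
    )
    # Most frequent heading first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Invented result set for a search on 'Emerson'.
records = [
    {"author": ["Smith, B."]},
    {"author": ["Smith, B."]},
    {"author": ["Smith, B.", "Jones, C."]},
    {"author": ["Emerson, A."]},
]

for heading, count in facet_by_count(records, "author"):
    print(count, heading)
```

With this ordering a prolific author who merely writes about Emerson
sorts above an author actually called Emerson, which is the behaviour I
found unhelpful.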

It would be nice to see something more sophisticated here - so if I've
searched for 'Emerson' and drop into the 'Author' facet, perhaps it
would be fair to make the assumption that authors called Emerson are
more 'relevant' to me than others. However, I can see weaknesses to this
approach - and it just goes to show we are some way off good relevance
ranking for this type of search. I suspect this is why I would prefer to
drop into an Alphabetical sort at this stage - the 'relevance' just
doesn't work well enough at the level of facet browsing. I think that
the software ought to be able to make some intelligent decisions here -
if I've entered a single word and then go to facet browsing, the
likelihood that it can do good 'relevance' is very low, so alphabetical
could be the default. If I've entered multiple words, some of which
appear in author headings, some in subject headings and some in titles,
then the chances of combining this data to give me a good relevance
match increase, and perhaps relevance ranking for facets is going to
turn up trumps (this is assuming that it gets more sophisticated than
just 'number of hits').
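
A rough sketch of the two ideas above, under stated assumptions: the
function names (rank_facet, choose_default_sort), the boost weight, the
simple substring matching and the sample records are all hypothetical,
and a real system would need something far less crude.

```python
from collections import Counter

def rank_facet(query, result_set, field, boost=10):
    """Rank facet headings by hit count, boosted when a heading contains a query term.

    Assumption: records are dicts mapping a field name to a list of headings,
    and 'boost' is an arbitrary illustrative weight.
    """
    terms = query.lower().split()
    counts = Counter(h for record in result_set for h in record.get(field, []))

    def score(item):
        heading, count = item
        matches = sum(term in heading.lower() for term in terms)
        return count + boost * matches

    # Highest score first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-score(item), item[0]))

def choose_default_sort(query, result_set, fields=("author", "subject", "title")):
    """Pick a default facet sort: 'relevance' only when there is enough to go on."""
    terms = query.lower().split()
    if len(terms) < 2:
        # A single word gives too little evidence for a useful ranking.
        return "alphabetical"
    matched_fields = set()
    for record in result_set:
        for field in fields:
            text = " ".join(record.get(field, [])).lower()
            if any(term in text for term in terms):
                matched_fields.add(field)
    # Require query terms to hit at least two different kinds of heading.
    return "relevance" if len(matched_fields) >= 2 else "alphabetical"

# Invented sample records.
records = [
    {"author": ["Smith, B."], "subject": ["Transcendentalism"], "title": ["Emerson and his circle"]},
    {"author": ["Smith, B."], "subject": ["Emerson family"], "title": ["Letters"]},
    {"author": ["Smith, B.", "Jones, C."], "subject": ["Essays"], "title": ["On nature"]},
    {"author": ["Emerson, A."], "subject": ["Essays"], "title": ["Collected essays"]},
]

print(rank_facet("Emerson", records, "author"))
print(choose_default_sort("Emerson", records))
print(choose_default_sort("Emerson essays", records))
```

Here the author called Emerson moves to the top of the Author facet
despite having a single hit, and the default only switches to relevance
once the query has more than one word and matches more than one kind of
heading. The weaknesses are easy to see: substring matching will boost
any heading that happens to contain the word, and the weight is a guess.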

There is a long way to go with 'relevance' in terms of library
catalogues - we clearly need information that goes beyond the bib
metadata in the catalogue to achieve something that the end user feels
'works' in the same way that people feel Google 'works' (not perfectly,
but overall pretty impressive) - and we haven't gone far enough in
bringing in contextual data (usage, previous searches, social network
info, recommended reading, personal data etc.). Even then we have to
recognise that we can only go so far - we/our systems aren't psychic
(reminds me of http://bookworm.lboro.ac.uk/readinglists-todo.html - Next
Gen reading list functionality).

Owen

Owen Stephens
Assistant Director: e-Strategy and Information Resources
Imperial College London Library
Imperial College London
South Kensington
London SW7 2AZ

Tel: 020 7594 8829
Email: o.stephens_at_imperial.ac.uk

> -----Original Message-----
> From: Next generation catalogs for libraries
> [mailto:NGC4LIB_at_listserv.nd.edu] On Behalf Of Bernhard Eversberg
> Sent: 21 December 2007 10:20
> To: NGC4LIB_at_listserv.nd.edu
> Subject: Re: [NGC4LIB] Aqua Browser in beta at U. Chicago
>
> Stephens, Owen writes:
>
> >
> > Firstly, when I clicked through to the 'more' on the Author facets, I
> > found it frustrating that the default sort order was relevance rather
> > than alphabetical.
>
> The use of the word "relevance" is misleading. No software can guess,
> from the few words the user feeds it in every instance, whether
> something that matches the character strings is actually relevant
> to what the user had in mind.
> True relevance is always subjective - the machine _cannot_ know
> anything about the user's motivations and intentions (in fact, it
> cannot _know_ anything at all), it can only match character strings in
> more or less clever ways. There may be a _correlation_ between this and
> relevance, but a correlation is not the same as an approximation. Only
> the latter would justify the use of the word.
>
> B.Eversberg
>