Re: next "next-generation library catalog"

From: David Pattern <d.c.pattern_at_nyob>
Date: Thu, 1 Jul 2010 10:41:03 +0100
To: NGC4LIB_at_LISTSERV.ND.EDU

Hi Oliver

It would be fascinating to compare and contrast the two methods :-)

Fortunately for us, our ILS/LMS had been collecting basic circ data since it was installed in 1995, so we had 10 years' worth of data to start with.  I feel this has helped greatly in making the recommendations relevant.

Since adding this layer of serendipity to the OPAC, we've seen the average number of books borrowed by students increase by nearly 2 books per year per student (we have a student population of around 24,000 at the moment).

I've just tweaked our OPAC to start collecting click data in a similar way to you.  If you find a good format for releasing the data as Open Data, let me know and I'll make sure we do the same :-)

When we did some previous work on releasing usage data, I found that roughly 70% of the transactions could be mapped to an item that had an ISBN:
http://library.hud.ac.uk/usagedata/

regards
Dave


-----Original Message-----
From: Next generation catalogs for libraries [mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Oliver Flimm
Sent: 01 July 2010 09:08
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] next "next-generation library catalog"

Hi,

On Wed, Jun 30, 2010 at 05:02:35PM +0100, David Pattern wrote:
> We've had recommendations ("people who borrowed this...") based on circ data in our OPAC since 2005 and it's had a huge (positive) impact on how stock circulates and on how many items students borrow per year, e.g.
> http://library.hud.ac.uk/catlink/bib/415607/cls/

When we started experimenting with the implementation of our own
recommendation system back in 2006, we evaluated both the circulation
data of our library system and the usage data of our KUG OPAC. We
initially went the circulation route, only to find that:

a) we couldn't get enough data. A dataset for one borrowed book every
4 weeks (until it's returned) results in fewer items to analyse and
thus worse statistics - although we have quite a lot of loans
(including ILL!!!), with 1,015,450 circulations in 2009.
The number of items borrowed at any one time ranges from 60,000 to
150,000 at our library.

b) our library system is not very willing to give us information about
circulation events, so we had to export a daily snapshot of all items
borrowed, and then figure out the actual circulation activity - not
what I would call compelling...

c) we didn't want to struggle with German data privacy obligations
concerning user data, which are rather strict
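
The snapshot comparison described in (b) can be sketched roughly as
follows. This is only an illustration, not the actual KUG code: it
assumes each daily snapshot is simply a set of item IDs currently on
loan, and the function name and sample IDs are made up.

```python
def circulation_events(previous, current):
    """Derive loan/return events by comparing two daily snapshots.

    Each snapshot is assumed to be a set of item IDs on loan that day
    (a hypothetical format, not the real export).
    """
    loaned = sorted(current - previous)     # on loan today, not yesterday
    returned = sorted(previous - current)   # on loan yesterday, not today
    return {"loaned": loaned, "returned": returned}


# Made-up sample snapshots for two consecutive days
monday = {"item-1", "item-2", "item-3"}
tuesday = {"item-2", "item-3", "item-4"}

events = circulation_events(monday, tuesday)
print(events)
```

A comparison like this cannot see an item that is returned and borrowed
again between two snapshots, which is one reason the approach gives only
an approximation of the real circulation activity.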

Instead we then tried to analyse KUG usage. We suspected that every
time a user selects a specific title from a search result list, it
might be of interest to them, and so we accumulated all those titles
for a specific anonymous session. Then we compared different titles of
different sessions. Pretty thin... we thought initially,
until we analysed all those packets of titles per session.
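
One simple way to analyse such per-session packets is to count how often
two titles turn up in the same session. The sketch below is a guess at
the general idea, not the method actually used in KUG; the session IDs,
title IDs and function names are all invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(sessions):
    """Count how often two titles were viewed in the same session."""
    counts = defaultdict(Counter)
    for titles in sessions.values():
        # Each pair of distinct titles in a session counts once
        for a, b in combinations(sorted(set(titles)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, title, n=3):
    """Return up to n titles most often seen together with `title`."""
    return [t for t, _ in counts.get(title, Counter()).most_common(n)]


# Made-up anonymous sessions: session ID -> titles clicked
sessions = {
    "s1": ["T1", "T2", "T3"],
    "s2": ["T1", "T2"],
    "s3": ["T2", "T4"],
}

counts = cooccurrence(sessions)
print(recommend(counts, "T1"))
```

With raw counts like these, a title list that is repeated in many
sessions will dominate the results, so some threshold or weighting would
probably be needed in practice.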

The result was very good, although we had - like Amazon - 'false'
titles too. Interestingly enough these were quite often the result of
tutorials we offer to our users with always the same titles from
different subject areas ;-)

All of this usage data (and much more) has been collected over the
last 2-3 years in a separate statistics database of our KUG system.
Right now we have registered around 5,600,000 clicks for a full title
display.

BTW, does anybody know of an ontology to describe the events of those
clicks per session for the raw data we collected, or recommendations
in general? So we could - like our bibliographic records - also release
our raw or processed recommendation data as Open Data ;-)

To be of any use for others it would also be essential to attach an
identifier, like an ISBN, to every media item. This would reduce the
number of recommendations but would make them much more usable
elsewhere.
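
Restricting recommendations to items with an identifier could look
something like the sketch below. It assumes a lookup table from local
title IDs to ISBNs is available; the table, the placeholder ISBN strings
and the function name are hypothetical.

```python
def with_isbn(recommendations, isbn_of):
    """Re-key recommendations by ISBN, dropping titles without one."""
    result = {}
    for title, recs in recommendations.items():
        if title not in isbn_of:
            continue  # the source title has no ISBN, so skip it
        mapped = [isbn_of[r] for r in recs if r in isbn_of]
        if mapped:
            result[isbn_of[title]] = mapped
    return result


# Made-up data: local title ID -> recommended local title IDs
recommendations = {"T1": ["T2", "T3"], "T3": ["T1"]}
# Placeholder identifiers, not real ISBNs; T3 has none
isbn_of = {"T1": "isbn-a", "T2": "isbn-b"}

portable = with_isbn(recommendations, isbn_of)
print(portable)
```

Here the entry for T3 disappears completely and T1 loses one of its two
recommendations, which shows how the number of recommendations shrinks
once an ISBN is required.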

Just my 0.02 EUR ;-)

Regards,

O. Flimm

--
Universitaet zu Koeln :: Universitaets- und Stadtbibliothek
IT-Dienste :: Abteilung Universitaetsgesamtkatalog
Universitaetsstr. 33 :: D-50931 Koeln
Tel.: +49 221 470-3330 :: Fax: +49 221 470-5166
flimm_at_ub.uni-koeln.de :: www.ub.uni-koeln.de


Received on Thu Jul 01 2010 - 05:42:17 EDT