Because we had a large amount of checkout data to start with (from memory, it was around 2 million transactions over a 10-year period), we went for a threshold of 7 or 8 data points (I'd need to double-check the code to find the exact figure).
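As a rough illustration of that kind of threshold, here is a minimal Python sketch. It is not the actual Huddersfield code: the function name also_borrowed, the (borrower_id, item_id) data layout, and the default of 7 are all assumptions made for the example. A pair of items is only suggested when enough distinct borrowers took out both, so a coincidence involving a single borrower is never exposed.

```python
from collections import defaultdict
from itertools import combinations


def also_borrowed(checkouts, min_borrowers=7):
    """Return {item: [(other_item, shared_borrower_count), ...]}.

    checkouts: iterable of (borrower_id, item_id) pairs (hypothetical layout).
    min_borrowers: assumed threshold; a pair is suggested only when at least
    this many distinct borrowers took out both items.
    """
    # Collect the distinct items each borrower has taken out.
    items_by_borrower = defaultdict(set)
    for borrower, item in checkouts:
        items_by_borrower[borrower].add(item)

    # Count how many distinct borrowers share each pair of items.
    pair_counts = defaultdict(int)
    for items in items_by_borrower.values():
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1

    # Keep only pairs that meet the threshold, in both directions.
    suggestions = defaultdict(list)
    for (a, b), n in pair_counts.items():
        if n >= min_borrowers:
            suggestions[a].append((b, n))
            suggestions[b].append((a, n))

    # Most frequently shared items first, ties broken by item id.
    for item in suggestions:
        suggestions[item].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(suggestions)
```

With a threshold of 1, a pair borrowed together by a single person would show up as a suggestion; raising min_borrowers suppresses those small-number cases at the cost of fewer suggestions for rarely borrowed items.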
Our "people who borrowed this, also borrowed..." service has been live since Nov 2005 and has grown steadily in popularity, now getting up to 4000 clicks per month. Our users are also able to view their entire circ history from within their account page on the OPAC.
Although I'd argue that we protect user privacy just as strongly in the UK as you do in the US, the UK's Data Protection Act allows for a more flexible framework for collecting user-generated data. The bottom line is that data must not be used in a way that identifies an individual, and data must not be stored for longer than is necessary. Once a student graduates, their borrower record is deleted, and that breaks the link between the circulation transactions and a specific individual.
When we launched the service, I did expect we'd get a few queries from users (e.g. "what data is the library collecting?", "what does the library do with the data?", etc) but, to date, we've not received any.
regards
Dave Pattern
University of Huddersfield
________________________________
From: Next generation catalogs for libraries on behalf of Tim Spalding
Sent: Wed 21/05/2008 03:26
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] User Privacy (was: [NGC4LIB] bibtip (How it works))
What do you people think is the appropriate number of data points
necessary to protect patron privacy in a recommendation system?
One point would be a situation where, if only one user took out or
looked at both Book A and Book B, the recommendation system would
reveal this coincidence. I contend this would violate patron
privacy: if you knew one book someone took out you could discover
others. The logic of small numbers would undermine the idea of
anonymity.
I'm thinking you need at least three, and probably more. John Blyberg
went for three or more in his SOPAC recommendations
(http://www.blyberg.net/2007/01/31/dynamic-item-recommendations/). I'm
not sure if that was for quality or privacy. That was based on opt-in
data.
Tim
Received on Wed May 21 2008 - 02:11:43 EDT