Hello Tim,
Thanks for the very interesting reply. I apologize for calling the "ant navigation" analogy faulty. Reinforcement is indeed something that needs to be considered in such systems; I am just not sure it is, in the case of BibTip, as big a problem as you believe. For a detailed technical explanation of the algorithm used in BibTip, see Andreas Geyer-Schulz et al. in Research and Advanced Technology for Digital Libraries (Lecture Notes in Computer Science 2769), 2003.
Beyond that, I am not entirely clear what else you are disagreeing with. My responses are below:
>So, I'm very interested in this topic and think BibTip is an
interesting test. That said, let me disagree with most of your email.
>>> In fact, your "ant navigation" analogy is a faulty one in this case. BibTip works astoundingly well, and it is not because it simply follows "where users go". Instead, BibTip uses "Repeat Buying Theory" as a framework to statistically analyze user search behavior. Repeat Buying Theory is a highly successful and well-tested statistical framework to describe the regularity of repeat-buying behavior of consumers within a distinct period of time.
>So, I don't want to get snippy, but I am pretty familiar with the
statistical problems of this approach and of others, and have done
extensive work-and with appropriately large datasets-on LibraryThing.
Pardon the contradiction, but nothing about my description of the "ant
tracking" problem was faulty, so let me explain it again.
Kevin: Large datasets are obviously better, as is the length of time that the co-browsing data has been collected and analyzed. BibTip has been running at Karlsruhe since 2002, and their cataloged collection size is more than 15 million documents across 23 libraries. Statistical problems aside, it has been quite a successful experiment. I encourage you to search their catalog at http://www.ubka.uni-karlsruhe.de/hylib/suchmaske.html
>>> The developers of BibTip at Karlsruhe University very skillfully adapted this theory to the session-based search behavior of library OPAC users. The key is that BibTip only records the inspection of the full details of an individual bib record selected from a larger list of search results. It does not "follow" the user.
>I understand it doesn't "follow the user" on two legs, but it records
what books discrete users visit and then makes statistical inferences
from it. This amounts to a picture of where users are and where they
go, albeit without the order-of-events data which, actually, would
improve it.
Kevin: As I understand it, the algorithm is not concerned at all with discrete users. It is concerned simply with discrete pairs of records and session identifiers. Record pairs are analyzed; no other aspect of user behavior matters. I would be interested to know why order-of-events matters in this context.
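To make that concrete, here is a minimal sketch of session-based pair counting (Python; the record and session names are invented, and this is my own illustration, not BibTip's code). Nothing below identifies a user, only an anonymous session:

    # Minimal sketch: counting co-inspected record pairs per session.
    # "sessions" maps an anonymous session ID to the set of records whose
    # full-details pages were opened in that session.
    from collections import Counter
    from itertools import combinations

    sessions = {
        "s1": {"rec_A", "rec_B", "rec_C"},
        "s2": {"rec_A", "rec_B"},
        "s3": {"rec_B", "rec_D"},
    }

    pair_counts = Counter()
    for viewed in sessions.values():
        # Unordered pairs: (X, Y) and (Y, X) count as the same co-inspection.
        for x, y in combinations(sorted(viewed), 2):
            pair_counts[(x, y)] += 1

    print(pair_counts.most_common(2))
    # [(('rec_A', 'rec_B'), 2), (('rec_A', 'rec_C'), 1)]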
>>> In this framework, clicking on and reading the full details of a given record is an economic choice. The choice of one record over all of the others in a given list is very similar to an individual's choice to purchase one thing over another during a given trip to the store. There is a real cost in time (i.e. an economic cost) for the user each time he/she selects and views a record. It can be assumed that the "search cost" to a user is high enough that he/she is willing only to view the details of a record which is truly of interest. Users, in effect, are self-selecting. That is, users with common interests will select the same documents, and, since recommendations are only provided to users from the full details view, we can surmise that recommendations are only offered to interested users.
>All this is obvious. But systems built on statistics have flaws. One
of them is reinforcement-the ant problem. The more you recommend
something the more people will follow your recommendations and the
more co-occurrences there will be. Success breeds success. The ant
trail, once started, has a tendency to get stronger. This is true in
any recommendation system, unless you adjust for it explicitly-which
has statistical problems too.
>At base, the quality of the recommendations from a screen-watching
model is related to two factors: (1) how good is the searching? and
(2) how costly is failure?
>Take Amazon purchases. Finding a book on Amazon is easy, and the cost
is high. So there are few mistakes-people tend to buy the books they
want, so the signal is strong. The main problems relate to agency:
people buying books for other people. That's hard to correct for, so
you just hope that you can discern signal from noise when the quantity
of data is great.
>Unfortunately, the OPAC is not Amazon. Leaving aside scale-WorldCat
receives 0.7% of the hits Amazon does!-there's the issue of search
quality. Bad OPAC search means that people spend a lot of time on
detail pages they weren't aiming for. Search for "Da Vinci Code" at
SPL, for example, and you get a page full of results without an actual
paper copy of the Da Vinci Code. Results matter. People hate reading
results, they hate revising searches and they hate looking at
subsequent search pages. Many would rather dive into a record quickly
and back out, to see if they can leverage
partial success. Personally, I've learned to click on some version I
don't want-the Spanish or the eBook-knowing there's an author link
there I can leverage to get to the paper version I really want. I
suspect I am not alone. And each time I do it, I create noisy data for
a recommendation system.
Kevin: Noise is indeed an issue. However, since BibTip functions without needing to know how something was searched (it does not record the search terms that got a user to a particular record), I am unclear how the quality of a particular search matters. Diving into a record quickly and backing out quickly falls well within the repeat-buying model. Indeed, repeat-buying theory predicts random co-browsing (diving in) very well; in fact, that is the very point of the theory! Recommendations are based upon those records that fall outside of regular random co-browsing: the outliers. To quote Dr. Geyer-Schulz (one of the developers of BibTip):
"Ehrenberg's theory faithfully models the noise part of buying processes. That is, repeat-buying theory is capable of predicting random co-purchases of consumer goods. Intentionally bought combinations of consumer goods--a six-pack of beer, spareribs, potatoes, and barbecue sauce for dinner, for example--are outliers. In this sense, Ehrenberg's theory acts as a filter to suppress noise (stochastic regularity) in buying behavior." [From: Andreas Geyer-Schulz, Andreas Neumann und Anke Thede. An Architecture for Behavior-Based Library Recommender Systems. Information Technology and Libraries 22(4), p.169 (2003).]
That is, *most* of the given transactions are noise. Search terms and strategies are irrelevant. The co-browsing of records that lies outside of what is called the logarithmic series distribution is the browsing that needs to be examined for potential recommendations.
I would point out that your example of clicking on a Spanish or an e-book version of a record to get to a paper version would not necessarily constitute noise in this model.
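To illustrate the filtering idea, here is a rough sketch in Python (my own reading of the papers; the distribution is Ehrenberg's, but the fitting routine and the outlier threshold are invented for illustration). The logarithmic series distribution predicts how many pairs should reach a given co-inspection count by pure chance; the pair that far exceeds the prediction is the recommendation candidate:

    # Hypothetical sketch of the LSD-as-noise-filter idea (my reading of the
    # papers, not BibTip's code). The logarithmic series distribution gives
    # the chance probability that a pair is co-inspected r times:
    #     P(r) = -q**r / (r * ln(1 - q)),   r = 1, 2, 3, ...
    import math

    def lsd_pmf(r, q):
        return -(q ** r) / (r * math.log(1.0 - q))

    def fit_q(mean_count):
        # The LSD mean is -q / ((1 - q) * ln(1 - q)); solve for q by bisection.
        lo, hi = 1e-6, 1.0 - 1e-9
        for _ in range(100):
            mid = (lo + hi) / 2
            if -mid / ((1 - mid) * math.log(1 - mid)) < mean_count:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Observed co-inspection counts for the pairs involving one record.
    counts = [1, 1, 1, 1, 2, 1, 1, 9]  # one pair stands out
    q = fit_q(sum(counts) / len(counts))

    for c in sorted(set(counts)):
        observed = counts.count(c)
        # Number of pairs we would expect at count c under pure chance.
        expected = lsd_pmf(c, q) * len(counts)
        flag = "  <-- outlier, candidate recommendation" if observed > 3 * expected else ""
        print(f"count={c}: observed {observed}, expected {expected:.2f}{flag}")

The point is only that the common low counts match the chance model while the single high count does not; the actual model fitting and significance test are described in the LNCS paper cited above.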
>>> In order to build relationships among given documents, BibTip analyzes record pairs.
>That's the beginning of a good system. Among various improvements,
Amazon keeps track of order, because order matters. If 50 people who
look at the Spanish-language Harry Potter also look at the English,
that's interesting. That 49 of 50 went from the Spanish to the
English, that's more interesting. It suggests the Spanish should
recommend the English more highly than the reverse.
Kevin: In this model, both the Spanish and the English versions would be recommended (and correctly, I think).
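For what it's worth, a small sketch of the distinction (invented numbers, my own illustration): an unordered pair count treats the two directions identically, while transition counts would let the Spanish record recommend the English more strongly than the reverse; both records are still recommended, just asymmetrically:

    # Invented transition counts between two records in the same sessions:
    # 49 sessions went Spanish -> English, 1 went English -> Spanish.
    transitions = {("HP_es", "HP_en"): 49, ("HP_en", "HP_es"): 1}

    # An unordered pair count sees 50 co-inspections either way.
    unordered = sum(transitions.values())

    # A direction-aware score (one simple choice among many):
    for (src, dst), n in transitions.items():
        print(f"recommend {dst} from {src} with weight {n / unordered:.2f}")
    # recommend HP_en from HP_es with weight 0.98
    # recommend HP_es from HP_en with weight 0.02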
>>> For each record X that has been viewed in the full details view of the OPAC, a "purchase history" is built. This is simply a list of all of the sessions in which record X has been viewed. Record X is then compared with all other records (Y) which have been viewed in the same session as X. For each pair of records (X,Y) that have been viewed in the same session, a second purchase history is built. The number of users who have viewed record X and another record Y in the same session is statistically analyzed, and the probability of a "co-inspection" of records X and Y in a given session is calculated. A recommendation for record X (that is, "users who liked X also liked...") is created when record Y has been viewed in the same session more often than can be expected from random selection.
>I do hope they have a threshold-that a single instance of
co-occurrence will never trigger a suggestion. Otherwise it's a
privacy problem waiting to happen. John Blyberg discussed this problem
when SOPAC was released.
Kevin: See my response about noise above.
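To be concrete about why a single co-occurrence should not surface, here is a toy version of the purchase-history comparison quoted above (the independence baseline and the minimum-support floor are my own simplifications, not BibTip's published test):

    # history[X] is the set of sessions in which record X's full-details
    # view was opened; N is the total number of sessions observed.
    history = {
        "X": {"s1", "s2", "s3", "s5", "s8"},
        "Y": {"s1", "s2", "s3", "s9"},
        "Z": {"s4", "s5"},
    }
    N = 10
    MIN_SUPPORT = 2  # invented floor: never recommend on a single co-inspection

    for other in ("Y", "Z"):
        observed = len(history["X"] & history[other])
        # Expected overlap if records were selected independently at random.
        expected = len(history["X"]) * len(history[other]) / N
        ok = observed >= MIN_SUPPORT and observed > expected
        print(f"(X,{other}): observed={observed}, expected={expected:.1f} -> "
              + ("recommend" if ok else "ignore"))
    # (X,Y): observed=3, expected=2.0 -> recommend
    # (X,Z): observed=1, expected=1.0 -> ignore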
>>> This "repeat buying theory" is remarkably good at automatically determining relevant recommendations for a given item. It takes some time for enough data to be collected so that good recommendations are available for a substantial part of a collection, but what is the hurry? Of course, the longer you have the algorithm running, the better your recommendations become. The more users you have, the better your recommendations become. But, time is on our side in this case ;-)
>Well, except that you also need to expire data, or weight it less
over time. That people were examining the Bible and the Bible Code
together ten years ago is not an accurate predictor of what Bible
readers want today.
Kevin: I am not sure about the need to create a special process to expire data in this context. Expiration of the data occurs naturally as a result of the recalculation of the algorithm (again, see Lecture Notes in Computer Science 2769). Presumably people have been examining the Bible and other related materials in the intervening 10 years? Over time, the recommendations will reflect user preferences.
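By way of illustration (a toy example of the natural-expiration point, not the mechanism described in the papers): if each periodic recalculation ranks a record's recommendations by accumulated co-inspections, a pair that stopped accruing in 1998 simply slides down the list as newer pairs keep accruing, with no explicit expiry step:

    # Hypothetical co-inspection totals for pairs involving the Bible, as
    # they might look at two recalculations a decade apart (numbers invented).
    totals_1998 = {"The Bible Code": 10, "Strong's Concordance": 4}
    totals_2008 = {"The Bible Code": 11, "Strong's Concordance": 140,
                   "Bible commentary": 95}

    def top_recommendations(totals, k=2):
        return sorted(totals, key=totals.get, reverse=True)[:k]

    print(top_recommendations(totals_1998))  # the Bible Code leads in 1998
    print(top_recommendations(totals_2008))  # ...and has aged out of the top 2 by 2008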
>>> Frustratingly, for all the talk here and elsewhere of the features of next generation catalogs, I rarely find anything that convinces me that librarians understand that collecting/harvesting and re-using user (and usage) data is the key to most (if not all) of the services we want these new catalogs to provide. Without seriously thinking about the implications of harnessing collective intelligence - and taking steps *now* to build systems that do - we are not going to get very far. BibTip as a service is a big step in the right direction.
>Absolutely. I agree. I agree completely. I am a fan of the idea.
Nobody should take my words as a wet blanket on the fires of
experimentation. But moving past my desire to promote interesting
things and into analysis and experience with the topic, I am skeptical
on two fronts:
>1. There are real privacy implications to collecting user data. I
think they can be solved, but they cannot be dismissed. And solving
them hurts your data quality/quantity.
Kevin: My knee-jerk reaction is to dismiss the privacy implications. But I know that this cannot be done, as you say. I do believe that, given the current information environment, our patrons will be much more amenable to user/usage data collection. There are many, many possibilities here, from user-built profiles (we have to give them a reason to want to build those profiles, though) to more algorithmically analyzed usage data (BibTip is a great example of this).
>2. I am skeptical that libraries can accumulate enough high-quality
data to compete against other systems.
Kevin: You rightly point out that the critical mass problem is a big one. But, I don't know that we *really* need to compete with anyone. There are 14,000 students at Boston College and I can think of a lot of things we can do with data we could readily begin collecting. When it comes to things that really require critical mass - like tagging, reviews and ratings - we need to begin to develop platforms that can link users, usage and bib data across universities (I am really not qualified to comment on the needs of public libraries in this context). We have the technology now to begin doing this.
>For curiosity's sake, I am tempted to try the idea, leveraging the
LibraryThing for Libraries traffic. But, for the reasons above and
having experimented a great deal with recommendation data from
different sources and of different qualities, I'm very skeptical that
OPAC-based path-watching will ever be a significant source of
recommendations. But you may label me an interested party-we sell
recommendation data to libraries.
Kevin: As you can tell, I completely disagree that OPAC-derived data will never be a significant source of recommendations. BibTip is the proof that a significant system can be built - right now with modest technology and modest collection sizes.
That said, I would be very interested to know how LibraryThing - a system which I admire very much - builds its recommendations.
Thanks again,
Kevin
--------------------------------------
Kevin M. Kidd, MA, MLIS
Library Applications & Systems Manager
Boston College Libraries
Phone: 617-552-1359
Fax: 617-552-1089
e-Mail: kevin.kidd_at_bc.edu
Blog: http://datadrivenlibrary.blogspot.com/