Re: The next generation of discovery tools (new LJ article)

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Mon, 21 Mar 2011 13:36:13 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

You're doing this with in-house code? Interesting.

So do you grab ALL the results from both Solr and Primo, so you can 
merge them?  I'm surprised that doesn't create a performance problem. 
I'm also curious how you manage to normalize relevancy from the two 
systems.

All in all, this is interesting work, with a lot of tricky details, and 
I think you should write up the details for the Code4Lib Journal :)

Jonathan

On 3/21/2011 1:17 PM, Joshua Greben wrote:
> Hi Jonathan,
>
> Yes, the blending in this context is exactly as you describe. More specifically, we take the top relevancy-ranked results from our Solr engine and the top relevancy-ranked results from the Primo Central API. Each result set comes with a relevancy score for every document in it. Naturally, the scores from the two systems differ, so we normalize them. We access the Solr results through a Java object that the Solr software creates (a Collection of Solr document records). We parse the Primo Central API results with the help of a schema that Ex Libris provides as part of the API, then convert each Primo Central record into a data type that matches what is already in the Solr Java object. Finally, we insert the Primo Central results into that object, using the normalized relevancy scores to find the right position for each one. That is how we are able to display a combined relevancy-ranked result set.
>
> Josh
>
>
> Joshua Greben
> Systems Librarian/Analyst
> Florida Center for Library Automation
> 5830 NW 39th Ave,
> Gainesville, FL 32606
> 352-392-9020 ext 246
> jgreben_at_ufl.edu
>
>
>
>
>
> On Mar 21, 2011, at 12:37 PM, Jonathan Rochkind wrote:
>
>> On 3/18/2011 2:41 PM, Jean Phillips wrote:
>>> At FCLA and other places there are people working on the ability to include megaindexes of articles and others into their locally developed or open source Discovery Tool.  We've recently been able to blend the results from Ex Libris's Primo Central Index in with our local repository of metadata from the catalog and digital collections sources.
>> When you say "blend"... what do you mean exactly?  Do you really mean blend: hits from the external index interspersed with hits from your local repository, with some relevance algorithm merging the two sets?
>>
>> How ever did you manage that?
>>
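
For readers following the thread, here is a rough sketch of the kind of blending Josh describes: normalize the scores of each result set, then merge the two sets into one relevancy-ranked list. The thread does not say which normalization FCLA uses, so min-max scaling is purely an assumption here, and the BlendSketch and Hit names are hypothetical stand-ins rather than Solr or Primo Central API classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BlendSketch {

    /** Hypothetical stand-in for one search hit; not a Solr or Primo class. */
    static final class Hit {
        final String id;
        final String source;
        final double score;

        Hit(String id, String source, double score) {
            this.id = id;
            this.source = source;
            this.score = score;
        }
    }

    /**
     * Min-max normalization (an assumed technique, not confirmed by the thread):
     * rescales the scores of one result set to the range [0, 1].
     */
    static List<Hit> normalize(List<Hit> hits) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Hit h : hits) {
            min = Math.min(min, h.score);
            max = Math.max(max, h.score);
        }
        double range = max - min;
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            // If every score is identical, treat all hits as equally relevant.
            double scaled = (range == 0.0) ? 1.0 : (h.score - min) / range;
            out.add(new Hit(h.id, h.source, scaled));
        }
        return out;
    }

    /**
     * Normalizes each set independently, then merges them in descending
     * score order. List.sort is stable, so on ties local hits stay first.
     */
    static List<Hit> blend(List<Hit> local, List<Hit> remote) {
        List<Hit> merged = new ArrayList<>(normalize(local));
        merged.addAll(normalize(remote));
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged;
    }
}
```

With local scores of 12.0, 6.0 and 3.0 and remote scores of 0.9, 0.5 and 0.1, the remote hit at 0.5 lands above the local hit that started at 6.0, since the latter normalizes to about 0.33. Note that min-max scaling always gives the top hit of each set a score of 1.0, whether or not the two are equally good matches, which is one reason a production system might prefer a different normalization.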