Re: Preliminary report on user research for eXtensible Catalog

From: Diane I. Hillmann <dih1_at_nyob>
Date: Wed, 17 Jun 2009 14:18:34 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

Jonathan:

I think this is a really interesting thread, as the whole metadata 
aggregation path has been neglected up until now and is only just being 
"re-discovered" as an approach with legs.

Jonathan Rochkind wrote:
> Interestingly, the Serials Solutions Summon product aims to be just 
> such a middle-man.  Actually, to me, the hard part that you want this 
> 'middle man' to do isn't really replicating the search functionality, 
> but gathering, checking, and normalizing the metadata from many 
> sources on a regular basis.
I agree with this, but I think it's important to recall that Summon is 
designed to be the whole enchilada--not just an invisible middle man 
supporting a variety of user interfaces.  I believe their strategy is to 
market the whole experience, soup to nuts, including the library's MARC 
data.  I frankly can't imagine that they'd be all that interested in 
providing data only to other services, but it's certainly worth asking. 
One of the things you probably wouldn't get with that pre-normalized 
data is any transparency about what they've done to it, and I think 
that's a huge problem, especially if you want to integrate even more 
data with what you already have.

The other thing that's kind of interesting here is whether services such 
as Summon will be integrating data from Institutional Repositories into 
the mix.  Given their important relationships with providers of 
access-protected materials, this might be politically impossible, and 
yet one of their marketing points is that they limit the data they 
provide for you to what you already subscribe to, thus not presenting 
your users with hits they can't access. As the scholarly open access 
movement builds steam, do you suppose that IR data might be a "no-go" 
for a service depending on commercial providers?  Another good question 
to ask.
>
> Personally, I'd still want to _get_ this metadata into my local 
> system, like XC, rather than simply pay a vendor like Summon to 
> provide the search interface.  But I might still want to pay a vendor 
> to gather and normalize this metadata; that might be more cost 
> effective than trying to do it myself.
I think one of the important things to remember is that services like 
Summon are likely to normalize their data so it works well with their 
interface, not necessarily yours.  As someone who's spent a lot of time 
thinking about normalization and how to make aggregated metadata work 
well, I can tell you that there are lots of things that you can do with 
normalization and "improvement" (not the same thing, in my opinion) that 
can improve the user experience, but it really helps to know how your 
search engine works and be able to tweak both that and your data on your 
own, or with a group of like minded collaborators.  XC is using some of 
our ideas and experiences and one of the important areas where they 
stand out is that they're working to provide services to parse legacy 
MARC into RDA.  The FRBR-aware structure is going to vastly improve how 
users can move around the data, instead of having to plow through pages 
of results ranked by god-knows-what criteria.
>
> Summon is going to have to gather and normalize the metadata to 
> provide their product -- but their current business model doesn't 
> include providing you with a dump of the metadata so you can index it 
> in your local system, like XC.  Perhaps if enough people ask for it, 
> they'll consider if there's a way to make a business model off of that 
> too.
>
Maybe, and it's worth asking, given that not everyone is as prepared as 
thee and me to mess around with data normalization.  But if you've 
looked at their system demos you'll note that they have the same 
problems with personal names that everybody else does who tries to merge 
journal articles and MARC metadata.  I talked to Peter McCracken after 
the Midwinter demos, and mentioned that I thought the personal name 
problem was an important one to tackle, if a product like Summon is 
going to be the Ginsu knife they want it to be.

Diane
> Jonathan
>
> Karen Coyle wrote:
>> As I recall, one of the issues with metasearch was that the searching 
>> itself put a great burden on the vendor systems -- since they 
>> received hits for every search done, even those not terribly relevant 
>> to their topic or offerings. Moving the search out closer to the 
>> user, and having the vendors only do delivery, should be appealing to 
>> them. What doesn't work, IMO, is to replicate this search 
>> functionality on hundreds of different systems. We need a middle man 
>> (or two or three) to provide a robust search function that libraries 
>> can then hook into. That search system can be optimized for search 
>> while libraries provide the UI and the A&I vendors do fulfillment.
>>
>> ? Possible?
>> kc
>>
>> Bowen, Jennifer wrote:
>>  
>>> Just to follow up, Jonathan asked me off-list whether we have active
>>> plans/thoughts about how to get this metadata from scholarly databases
>>> run by third party companies, and here's how I responded to him:
>>>
>>> This is going to have to be a negotiation process with these third party
>>> companies, and some companies are going to be more receptive to this
>>> than others, just as some of these companies have been willing to make
>>> MARC records available for their content in the past, and some have not.
>>> The rationale to use with the content vendors is that making metadata
>>> about their content more available through library discovery
>>> applications is going to increase interest in their products, and that
>>> this is better technology than metasearch because it doesn't impose
>>> limits on the number of resources retrieved from their database (such
>>> as, perhaps, the first 20 in a metasearch query).  We need to also
>>> reassure vendors that we are building technology that will address their
>>> requirements, such as limiting access to the content itself to paid
>>> subscribers.
>>>
>>> Obviously, this is going to take some effort to get a significant number
>>> of content providers on board.  Libraries that use XC and that subscribe
>>> to this content are probably in the best position to make a case for
>>> access to the metadata at the point when they are negotiating contracts
>>> - we can't really do it for them.  We see this as one activity that
>>> could be coordinated by the not-for-profit organization that we are
>>> forming to support users of the XC software.
>>>
>>> Jennifer
>>>
>>> -----Original Message-----
>>> From: Bowen, Jennifer
>>> Sent: Wednesday, June 17, 2009 10:10 AM
>>> To: 'Jonathan Rochkind'; Next generation catalogs for libraries
>>> Subject: RE: [NGC4LIB] Preliminary report on user research for
>>> eXtensible Catalog
>>>
>>> XC's architecture is based upon aggregating metadata, and we are
>>> building a robust platform that will perform this aggregation, called
>>> the XC Metadata Services Toolkit (MST).  This open source software will
>>> support metadata from scholarly databases as well as from library
>>> catalogs and repositories.  We believe that this is a much more
>>> promising direction for future discovery interfaces than relying upon
>>> metasearch technology, although the two approaches may need to be used
>>> alongside each other in the shorter term. We just don't see a promising
>>> future in continuing to  develop new software that uses metasearch
>>> technology.
>>>
>>> Jennifer
>>>
>>> -----Original Message-----
>>> From: Jonathan Rochkind [mailto:rochkind_at_jhu.edu]
>>> Sent: Tuesday, June 16, 2009 10:21 AM
>>> To: Next generation catalogs for libraries; Bowen, Jennifer
>>> Subject: Re: [NGC4LIB] Preliminary report on user research for
>>> eXtensible Catalog
>>>
>>> One of the things that stuck out to me in the report was users'
>>> confusion about whether they could find articles in the catalog, as
>>> well as users' unhappiness with having to learn new interfaces for
>>> additional scholarly databases.
>>>
>>> Are you considering trying to address the scholarly database issue 
>>> with some kind of federated broadcast search, aggregated index, or 
>>> other means of attempting to integrate scholarly article results in 
>>> the main interface?
>>>
>>> Jonathan
>>>
>>> Montibello, Joseph P. wrote:
>>>      
>>>> Jennifer,
>>>>  
>>>> Thanks so much for sharing this useful and very interesting report
>>>> with the community.  We talk so much about how we need to know what the
>>>> users need.  As you say in the report, this is not a comprehensive,
>>>> end-all be-all type of report, but it answers a few questions and
>>>> prompts even more.
>>>>  
>>>> Cheers!
>>>> Joe Montibello
>>>> Class of 1945 Library
>>>> Phillips Exeter Academy
>>>>  
>>>> ----------------------------------------------------------------------
>>>>
>>>> Date:    Mon, 15 Jun 2009 14:08:48 -0400
>>>> From:    Jennifer Bowen <jbowen_at_LIBRARY.ROCHESTER.EDU>
>>>> Subject: Preliminary report on user research for eXtensible Catalog
>>>>
>>>> (Posted on behalf of Nancy Fried Foster,
>>>> nfoster_at_library.rochester.edu)
>>>>
>>>> The eXtensible Catalog project at the University of Rochester's River
>>>> Campus Libraries is pleased to release the first report on the user
>>>> research that we conducted in support of XC software development. We
>>>> thank the Andrew W. Mellon Foundation and our user research partners -
>>>> Cornell, Ohio State, Yale and the University of Rochester - for their
>>>> generous support of this project.
>>>>
>>>> Use this URL - http://hdl.handle.net/1802/6873 - for a report that
>>>> summarizes the objectives, methods, and major software design findings
>>>> from the data collected in the user research portion of the eXtensible
>>>> Catalog (XC) project. A full analysis and interpretation of the data is
>>>> not included in the present report and will be provided at the
>>>> conclusion of the project.
>>>>
>>>> This report includes edited results from the brainstorming sessions and
>>>> a list of the features that emerged from the analysis of those results.
>>>> (See the eXtensible Catalog website at www.eXtensibleCatalog.org for
>>>> more information about the overall project.)
>>>>