Re: Adding EAD to the 'layer of discovery'?

From: Derek Rodriguez <darodrig_at_nyob>
Date: Wed, 23 Dec 2009 10:02:32 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU

Hi Mark,

    The TRLN libraries did recently complete the incorporation of EAD 
documents into our Endeca-based system.  Thanks for calling attention to 
this effort!  Initially conducted as a pilot 
<http://www.trln.org/endeca/task-groups/ead/index.htm>, the project went 
into production in August.  Currently, we are harvesting over 6,400 
EAD-encoded finding aids nightly from Duke University Libraries, the Duke 
University Medical Center Archives, NC State University, and UNC Chapel 
Hill.  We support search and display of EAD content in the consortium UI, 
Search TRLN <http://search.trln.org>, and in each of our member 
institutions' scoped Endeca interfaces.

    For indexing purposes, we extract a handful of EAD fields using XSLT 
and merge them with content from collection level MARC records before 
indexing.  The <eadid/> is added to the MARC records to facilitate this 
match.  Since we define our own data model in Endeca, we can map the 
contents of each EAD element to an appropriate field.  The elements we 
index include <bioghist/> and <overview/>, plus <scopecontent/> and 
<unittitle/> at all levels of the <dsc/>.
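
    A minimal sketch of the extract-and-merge step described above, 
written in Python rather than XSLT.  It is not the TRLN implementation: 
the sample finding aid, the function names, the "ead_" field prefix, and 
the dictionary standing in for the MARC records are all assumptions made 
for illustration, and it covers only some of the elements listed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample finding aid (un-namespaced, EAD 2002 style).
SAMPLE_EAD = """<ead>
  <eadheader><eadid>00123</eadid></eadheader>
  <archdesc level="collection">
    <did><unittitle>Sample Family Papers</unittitle></did>
    <bioghist><p>A family of farmers.</p></bioghist>
    <scopecontent><p>Letters and diaries.</p></scopecontent>
    <dsc>
      <c01><did><unittitle>Correspondence</unittitle></did>
        <c02><did><unittitle>Letters, 1901</unittitle></did></c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def text_of(elem):
    """Join all text inside an element and normalize whitespace."""
    return " ".join("".join(elem.itertext()).split())


def extract_ead_fields(xml_string):
    """Pull a handful of elements out of one EAD document."""
    root = ET.fromstring(xml_string)
    return {
        "eadid": root.findtext("eadheader/eadid", default="").strip(),
        "bioghist": [text_of(e) for e in root.iter("bioghist")],
        "scopecontent": [text_of(e) for e in root.iter("scopecontent")],
        # Unit titles at every level of the <dsc>, in document order.
        "dsc_unittitle": [
            text_of(e) for dsc in root.iter("dsc") for e in dsc.iter("unittitle")
        ],
    }


def merge_with_marc(marc_records, ead_fields):
    """Merge EAD fields into the MARC-derived record that shares the eadid.

    marc_records is an assumed dict of {eadid: {field: value}}.
    Returns None when no MARC record carries that eadid.
    """
    record = marc_records.get(ead_fields["eadid"])
    if record is None:
        return None
    merged = dict(record)  # copy so the original record is left untouched
    for name, values in ead_fields.items():
        if name != "eadid":
            merged["ead_" + name] = values
    return merged
```

Keeping the EAD values under their own prefixed field names is what lets 
an indexer treat them differently from the MARC fields later on.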

    We use some of the EAD-sourced fields stored in Endeca, such as 
<accessrestrict/>, <userestrict/>, and <prefercite/>, to support display.  
To display <abstract/>, <overview/>, <bioghist/>, and <dsc/> content we 
retrieve the EAD documents on the fly, transform them with XSLT, and display 
them in tabs on our full record screen as shown in the "Ammons papers" 
example you provided.  These representations are intended to support 
discovery and so we still link out to the finding aid of record 
maintained by each archives department.
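
    The tabbed display described above could be approximated as follows.  
This is only a sketch under stated assumptions: it uses Python's 
ElementTree in place of XSLT, works on an already retrieved, un-namespaced 
EAD string, and the tab labels, function names, and sample document are 
hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample finding aid used only for illustration.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><abstract>Papers of a farming family.</abstract></did>
    <bioghist><p>A family of farmers.</p></bioghist>
    <dsc>
      <head>Container List</head>
      <c01><did><unittitle>Correspondence</unittitle></did>
        <c02><did><unittitle>Letters, 1901</unittitle></did></c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def is_component(tag):
    """True for <c> and the numbered components <c01>, <c02>, ..."""
    return tag == "c" or (tag.startswith("c") and tag[1:].isdigit())


def dsc_outline(root):
    """Return the <dsc> hierarchy as lines of unit titles, indented by depth."""
    lines = []

    def walk(elem, depth):
        for child in elem:
            if is_component(child.tag):
                title = child.findtext("did/unittitle", default="[untitled]")
                lines.append("  " * depth + " ".join(title.split()))
                walk(child, depth + 1)

    dsc = root.find(".//dsc")
    if dsc is not None:
        walk(dsc, 0)
    return lines


def build_tabs(xml_string):
    """Map tab label -> display text, skipping tabs that have no content."""
    root = ET.fromstring(xml_string)
    tabs = {}
    for label, tag in (("Abstract", "abstract"), ("Biography/History", "bioghist")):
        elem = root.find(".//" + tag)
        if elem is not None:
            tabs[label] = " ".join("".join(elem.itertext()).split())
    outline = dsc_outline(root)
    if outline:
        tabs["Contents"] = "\n".join(outline)
    return tabs
```

Calling build_tabs(SAMPLE_EAD) yields one entry per tab, in the order the 
tabs would be shown; a finding aid without a <dsc/> simply gets no 
"Contents" tab.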

    The main benefit to users is discovery of archival materials 
alongside published materials, plus keyword indexing of specific EAD 
elements.  Your point about relevance ranking is a good one.  Many of 
the elements in our EAD documents are significantly larger than most 
of the metadata records in our indexes, so records matching on these 
fields could otherwise appear at the top of most result lists.  
To counteract this, we refined the Endeca relevance rank settings to 
weight matches in these fields much lower than matches in other fields.  
We don't populate facets with metadata from the EAD records; we just 
use the metadata in the collection-level MARC records 
for this.  Since our archives departments had been maintaining MARC 
records for these finding aids for several years, this did not represent 
a change in workflow.
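
    To illustrate the down-weighting idea in general terms, here is a toy 
field-weighted scorer in Python.  It is not Endeca's relevance ranking 
and does not use any Endeca API; the weights, field names, and sample 
records are all invented for the example.

```python
# Hypothetical weights: EAD-sourced fields count far less than MARC fields.
FIELD_WEIGHTS = {
    "title": 10.0,
    "subject": 5.0,
    "ead_bioghist": 0.5,
    "ead_scopecontent": 0.5,
}


def score(record, query_terms, weights=FIELD_WEIGHTS):
    """Sum, over fields, of weight * number of query terms present in the field.

    A term is counted once per field however often it occurs, so a long
    EAD element cannot win on repetition alone.
    """
    total = 0.0
    for field, weight in weights.items():
        tokens = set(record.get(field, "").lower().split())
        matched = sum(1 for term in query_terms if term.lower() in tokens)
        total += weight * matched
    return total


def rank(records, query_terms):
    """Return records ordered from highest to lowest score."""
    return sorted(records, key=lambda r: score(r, query_terms), reverse=True)


# Invented records: a short catalog record and a finding aid whose long
# biographical note repeats the query term many times.
catalog_record = {"id": "book-1", "title": "Collected poems of Ammons"}
finding_aid = {
    "id": "ead-1",
    "title": "Sample family papers",
    "ead_bioghist": "ammons " * 200,
}

results = rank([finding_aid, catalog_record], ["ammons"])
```

With these weights the title match on the catalog record outranks the 
finding aid, even though the finding aid's text mentions the term far 
more often.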

    You also mention advanced search options.  The goal of this project 
was to make EAD-sourced content available to all users in our standard 
interfaces.  So at this point, we have not implemented advanced search 
functionality specifically geared toward EAD content in this discovery 
layer.  That said, some of our libraries do provide an advanced search 
option for granular searches of their 'finding aids of record'.  A good 
example is that offered by NCSU 
<http://www.lib.ncsu.edu/findingaids/search/advanced>.

    I hope this is helpful.  Please let me know if you have questions.

    Derek

-- 
Derek Rodriguez
Program Officer
Triangle Research Libraries Network
CB# 3940, Wilson Library
Chapel Hill, NC 27514-8890
919-962-8022 fax:919-962-4452
derek_at_trln.org
http://www.trln.org

Custer, Mark wrote:
> I'm curious if anyone on the list has experience with adding their EAD documents into a larger discovery system?
>
> Here are two examples of what  I mean:
>
>
> *         Triangle Research Library Network now indexes (and displays) entire EAD documents.
>
> Example (in which I've restricted my results to "archival materials" and entered "ammons" as my keyword):
>
> http://search.trln.org/search?Nty=1&Ntk=Keyword&Ntt=ammons&N=200092
>
>
> *         University of Chicago library's implementation of AquaBrowser seems to index entire EAD documents.
>
> Example (in which I've searched for "American Automobile Brief History", quotes included, and where the first 3 results returned should be for archival finding aids):
> http://lens.lib.uchicago.edu/?q=%22american%20automobile%20brief%20history%22
>
> So, this leads me to three questions in particular:
>
>
> 1.       Can anyone point me to any other online examples of "discovery tools" that are ingesting entire EAD documents?  Summon, Encore, Primo, Blacklight, etc.??? (but, again, I'm not asking about OPACS that only search a surrogate of the EAD)
>
>
>
> 2.       For those of you that are including the entire EAD in your library's discovery tool, did you already have surrogate MARC records for those collections in your catalog?  If so, how are you dealing with those now that you're adding the EAD?
>
>
>
> 3.       What do you think of whole retrieval experience (advanced search options, facets, incorporation into the relevancy algorithm, etc.)?
>
> Thanks in advance for any and all advice and/or other examples that might be out there,
>
>
> Mark Custer
>
>