-----Original Message-----
From: Next generation catalogs for libraries [mailto:NGC4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan
Sent: Tuesday, March 11, 2014 11:52 AM
To: NGC4LIB@LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] linked data
On Mar 11, 2014, at 1:34 PM, Jeremy Nelson <Jeremy.Nelson@COLORADOCOLLEGE.EDU> wrote:
> I just did a soft release of a new catalog based on a design from Aaron Schmidt of Influx Design at http://catalog.coloradocollege.edu/ (code repository is available on github at https://github.com/jermnelson/tiger-catalog). I'm using both BIBFRAME and schema.org vocabularies along with MARC in a semantic storage backend (a combination of MongoDB, Redis, and Solr). This catalog, part of what I'm calling a catalog pull platform, is under active development and I would be very interested in getting feedback from this community.
>
> --
> Jeremy Nelson
> Metadata and Systems Librarian
> Tutt Library, Colorado College
> Colorado Springs, CO 80903
> (719) 389-6895
Interesting.

Jeremy, when you did this, what was the overall strategy? Something like this:

* export MARC records
* transform them into RDF
* load RDF into triple store
* index triple store
* provide search engine against index

--
Eric Morgan
Hi Eric,
My approach follows a lean-startup-inspired process of first creating a Minimum Viable Product (MVP) and then running iterative Build-Measure-Learn cycles to test and improve the product. In this catalog's first iteration I made an engineering choice to store JSON representations of all of Colorado College's MARC records, along with BIBFRAME and Schema.org JSON-LD, in MongoDB. Because of MongoDB's maturity, I'm able to query and manipulate both the MARC JSON and the JSON-LD with the same set of tools (I'm using a combination of the Flask microframework, Knockout.js, and Bootstrap).
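To make "the same set of tools" concrete, here is a minimal pymongo sketch; the database, collection, and field names are hypothetical, not necessarily what the tiger-catalog code uses:

    from pymongo import MongoClient

    client = MongoClient()    # local MongoDB instance
    db = client.catalog       # hypothetical database name

    # MARC-in-JSON documents live in one collection; MongoDB's dot
    # notation reaches into the nested field/subfield structure.
    marc = db.marc_records.find_one(
        {"fields.245.subfields.a": "Leaves of Grass"})

    # BIBFRAME and Schema.org JSON-LD documents live in other
    # collections, queried and updated through exactly the same API.
    work = db.works.find_one({"@type": "bf:Work"})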
My current strategy is (a rough code sketch of the pipeline follows this list):
* Authority and Bibliographic MARC21 records exported from legacy ILS
* Authority and Bibliographic MARC21 records converted to JSON
* Selected fields from the MARC JSON converted to BIBFRAME and Schema.org JSON-LD documents
* All JSON stored in MongoDB as document collections
* Selected triples, along with the entire JSON file, are retrieved from MongoDB and indexed into the Solr search engine
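Stitched together, the pipeline looks roughly like the sketch below, using pymarc, pymongo, and pysolr. The file name, database name, Solr URL, and the tiny Schema.org mapping are all placeholders; the real conversion covers many more fields and also produces the BIBFRAME documents:

    import json

    import pymarc
    import pysolr
    from pymongo import MongoClient

    db = MongoClient().catalog    # hypothetical database name
    solr = pysolr.Solr("http://localhost:8983/solr/catalog")

    with open("export.mrc", "rb") as marc_file:
        for record in pymarc.MARCReader(marc_file):
            # 1. MARC21 -> MARC-in-JSON, stored as-is in MongoDB
            marc_json = json.loads(record.as_json())
            db.marc_records.insert_one(marc_json)

            # 2. Map a few MARC fields to a Schema.org JSON-LD document
            work = {
                "@context": "http://schema.org",
                "@type": "CreativeWork",
                "name": record.title() or "",
                "creator": record.author() or "",
            }
            work_id = db.works.insert_one(work).inserted_id

            # 3. Index selected values into Solr for searching
            solr.add([{
                "id": str(work_id),
                "title": record.title() or "",
                "author": record.author() or "",
            }])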
In development now:
* RDF and JSON-LD interfaces to a generic Work view (see a MARC21 example at http://catalog.coloradocollege.edu/Work/52fe8be7650b8c1454122473; a rough Flask sketch follows this list)
* MODS from objects in our Fedora Commons collections ingested as BIBFRAME and Schema.org JSON-LD into MongoDB
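The Work interface I have in mind is something like the Flask route sketched here; this is a rough approximation with hypothetical names, not the code actually running behind catalog.coloradocollege.edu:

    from bson.objectid import ObjectId
    from flask import Flask, abort, jsonify
    from pymongo import MongoClient

    app = Flask(__name__)
    db = MongoClient().catalog    # hypothetical database name

    @app.route("/Work/<work_id>.json")
    def work_jsonld(work_id):
        # work_id is the Mongo ObjectId hex string that appears in
        # URLs like /Work/52fe8be7650b8c1454122473
        work = db.works.find_one({"_id": ObjectId(work_id)})
        if work is None:
            abort(404)
        work.pop("_id")    # ObjectId is not JSON-serializable
        work["@id"] = "http://catalog.coloradocollege.edu/Work/" + work_id
        return jsonify(work)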
There are many areas of the catalog that I hope to address in later iterations; development priorities will be pulled from patrons, library staff, and administration, along with the general needs of a more robust Linked Data system. Right now the catalog is limited to being a publisher of Linked Data; in the future I would like to implement a SPARQL endpoint for the semantic datastore.
Jeremy