Re: Aggregation of metadata

From: Till Kinstler <kinstler_at_nyob>
Date: Tue, 16 Feb 2010 10:57:36 +0100
To: NGC4LIB_at_LISTSERV.ND.EDU
Marja Haapalainen wrote:

> We are therefore looking for other, library or consortia driven, initiatives regarding the collection and aggregation of metadata 

In Germany there is the DFG Nationallizenzen funding programme (some 
information in English is available at 
http://www.dfg.de/en/research_funding/programmes/infrastructure/lis/digital_information/library_licenses/index.html).
Though the primary goal of this programme is the acquisition of digital 
scientific content, the Nationallizenzen license agreement requires 
publishers to deliver metadata about the purchased content as well.
We (the Verbundzentrale des GBV, a German library consortium) collect that 
data, then analyze, convert and aggregate it. So far we have about 22 million 
records.
This data is available through several interfaces (SRU, Z39.50, unAPI) and 
as a download for entitled users (institutions participating in the DFG 
Nationallizenzen). Some libraries use it in their metasearch portals; a 
few actually do something with the data downloads, e.g. index it in their 
own search interfaces, like ELIB Bremen: 
http://elib.suub.uni-bremen.de/. And we use it ourselves in the 
Nationallizenzen discovery interface 
(http://finden.nationallizenzen.de/), which we have just launched.
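
SRU, by the way, is just an HTTP GET carrying a CQL query, so using the data 
that way needs very little tooling. Here is a minimal Python sketch of a 
searchRetrieve request against such an interface; the base URL and the record 
schema are made-up placeholders, not our actual endpoint, so adjust both for 
a real service.

  # Minimal SRU searchRetrieve request; endpoint and schema are hypothetical.
  import urllib.parse
  import urllib.request
  import xml.etree.ElementTree as ET

  SRU_BASE = "https://sru.example.org/nationallizenzen"  # placeholder endpoint

  params = {
      "version": "1.1",
      "operation": "searchRetrieve",
      "query": 'dc.title = "information retrieval"',  # CQL query
      "maximumRecords": "10",
      "recordSchema": "marcxml",  # assumed; depends on what the server offers
  }

  url = SRU_BASE + "?" + urllib.parse.urlencode(params)
  with urllib.request.urlopen(url) as response:
      tree = ET.parse(response)

  # Report the hit count from the SRU 1.1 response.
  ns = {"sru": "http://www.loc.gov/zing/srw/"}
  print("records found:", tree.findtext(".//sru:numberOfRecords", namespaces=ns))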

The whole data aggregation workflow really is a pain. Much effort goes 
into the analysis and conversion of all the funny things you get from 
publishers: few deliver bibliographic record formats like MARC21; most 
provide some home-brewed XML-ish formats, Excel sheets, formatted text 
files... The worst was a Word document of more than 10,000 pages containing 
"records" as formatted paragraphs. Weird.

Till

-- 
Till Kinstler
Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG)
Platz der Göttinger Sieben 1, D 37073 Göttingen
kinstler@gbv.de, +49 (0) 551 39-13431, http://www.gbv.de
Received on Tue Feb 16 2010 - 04:59:50 EST