Re: "Repositories", OAI-PMH and web crawling

From: Diane Hillmann <metadata.maven_at_nyob>
Date: Mon, 27 Feb 2012 08:31:30 -0500
To: CODE4LIB_at_LISTSERV.ND.EDU
On Mon, Feb 27, 2012 at 5:25 AM, Owen Stephens <owen_at_ostephens.com> wrote:

>
> This issue is certainly not unique to VT - we've come across this as part
> of our project. While the OAI-PMH record may point at the PDF, it can also
> point to an intermediary page. This seems to be standard practice in some
> instances - I think because there is a desire, or even requirement, that a
> user should see the intermediary page (which may contain rights information
> etc.) before viewing the full-text item. There may also be an issue where
> multiple files exist for the same item - maybe several data files and a PDF
> of the thesis attached to the same metadata record - as the metadata via
> OAI-PMH may not describe each asset.
>
>
This has been an issue since the early days of OAI-PMH, and many large
providers provide such intermediate pages (arxiv.org, for instance). The
other issue driving providers towards intermediate pages is that such pages
let them keep deriving usage statistics for their materials, which
direct-access URIs and multiple web caches don't allow.  For providers dependent
on external funding, this is a biggie.
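
For anyone harvesting, here is a rough sketch of what this means in practice: the dc:identifier values in an oai_dc record can be a mix of landing pages, direct file links and non-URL identifiers, so a harvester has to sort them out itself. Everything specific below is an assumption for illustration only: the sample record and its example.org URLs are made up, classify_identifiers is a hypothetical helper, and guessing by file extension is a crude heuristic rather than anything OAI-PMH defines.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Dublin Core element namespace used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record, invented for this example (not from a real repository).
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example thesis</dc:title>
  <dc:identifier>http://repository.example.org/handle/1234/5678</dc:identifier>
  <dc:identifier>http://repository.example.org/bitstream/1234/5678/thesis.pdf</dc:identifier>
  <dc:identifier>oai:repository.example.org:1234/5678</dc:identifier>
</oai_dc:dc>"""

# Assumption: a URL path ending in one of these is treated as a direct file link.
FILE_EXTENSIONS = (".pdf", ".zip", ".csv")


def classify_identifiers(record_xml):
    """Split the HTTP(S) dc:identifier values into likely files and likely pages."""
    root = ET.fromstring(record_xml)
    result = {"file": [], "page": []}
    for element in root.iter("{%s}identifier" % DC_NS):
        value = (element.text or "").strip()
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https"):
            continue  # skip non-URL identifiers such as oai: or ISBN values
        looks_like_file = parsed.path.lower().endswith(FILE_EXTENSIONS)
        result["file" if looks_like_file else "page"].append(value)
    return result


if __name__ == "__main__":
    for kind, urls in classify_identifiers(SAMPLE_RECORD).items():
        for url in urls:
            print(kind, url)
```

Anything that lands in the "page" bucket still has to be fetched and inspected to find the actual full text, and a record with several attached files may list only some of them, or none.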

Diane
Received on Mon Feb 27 2012 - 08:32:00 EST