Re: Special OAIster Announcement from OCLC

From: Thomas Krichel <krichel_at_nyob>
Date: Sun, 20 Sep 2009 12:39:01 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

  Diane I. Hillmann writes

> Sadly much of the discussion about what's happened to OAIster lacks
> real understanding about what OAIster represents and what the
> OAI-PMH protocol represents as an alternative distribution system.

  I would be interested to see where my contributions demonstrate
  a lack of real understanding.

> Whether or not the information from OAIster is now available in
> WorldCat or FirstSearch for "free," the loss of OAIster as an openly
> available OAI aggregator represents a huge loss, and not just to
> those sites that depended upon it to help distribute their
> information.

  Oh, come on, it's not that big a deal to reproduce an aggregator of
  data. You download DOAR, you fire up Tim's harvester to fetch the
  records, bingo. If I can do it, anybody can. It's a bigger deal to
  produce a meaningful service because of the inadequate "Dublin Con"
  metadata that is used by default. For AuthorClaim, I need to figure
  out only four things:

  * title                                       easy
  * author names                                easy
  * stable id                                   tough
  * URL to a description of the resource        verrry tough

  That I still have to do, but I'd be happy to distribute what I can
  come up with.
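
  To make the four items above concrete, here is a minimal sketch of
  pulling them out of one harvested oai_dc record. It is an
  illustration under stated assumptions, not the AuthorClaim code: the
  function name extract_fields, the sample record, and the rule "any
  dc:identifier that starts with http is a candidate URL" are all
  hypothetical. The last rule is exactly the weak spot, since nothing
  in the record says whether such a URL leads to a description page,
  to the full text, or nowhere.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up record for illustration only; not taken from any real archive.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:123</identifier>
    <datestamp>2009-09-01</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An Example Paper</dc:title>
      <dc:creator>Doe, Jane</dc:creator>
      <dc:creator>Roe, Richard</dc:creator>
      <dc:identifier>http://example.org/abs/123</dc:identifier>
      <dc:identifier>doi:10.0000/example</dc:identifier>
    </oai_dc:dc>
  </metadata>
</record>"""


def extract_fields(record_xml):
    """Return title, authors, a stable-id candidate and URL candidates."""
    record = ET.fromstring(record_xml)

    def texts(path):
        # Collect non-empty, stripped text values for every match of path.
        return [
            e.text.strip()
            for e in record.findall(path, NS)
            if e.text and e.text.strip()
        ]

    titles = texts(".//dc:title")
    header_ids = texts(".//oai:header/oai:identifier")
    # Assumption: an http(s) dc:identifier *might* be the description URL.
    urls = [
        i for i in texts(".//dc:identifier")
        if i.startswith(("http://", "https://"))
    ]
    return {
        "title": titles[0] if titles else None,
        "authors": texts(".//dc:creator"),
        # The OAI header identifier is only as stable as the archive keeps it.
        "stable_id": header_ids[0] if header_ids else None,
        "candidate_urls": urls,
    }


if __name__ == "__main__":
    print(extract_fields(SAMPLE_RECORD))
```

  Title and creators fall out directly; the stable id and the URL are
  returned only as candidates that still need checking per archive.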

> What many people don't understand is that OAI-PMH itself is not a
> discovery mechanism, neither indeed, was OAIster.

  I am not sure what you mean by "discovery mechanism" here. 

> In the OAI world, data providers make their records available to
> data harvesters, and those harvesters make services available to
> users. OAI-PMH is optimized for automated harvesting over time,
> multiple metadata formats, and incremental updating.

  The claim that it is "optimized" baffles me. Full disclosure: I was
  part of the technical committee that designed OAI-PMH.  I pleaded to
  adopt a file-based approach. With public rsync, it would have been
  lightning-fast. Instead, we have what I think of as a "digital
  ritual" that is cumbersome to maintain, and slow to use.
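
  For readers who have not written a harvester, the request loop in
  question looks roughly like the sketch below. It assumes the
  standard OAI-PMH 2.0 ListRecords verb with the metadataPrefix, from
  and resumptionToken arguments; the helper names list_records_url,
  next_token and harvest are hypothetical, and the HTTP call is left
  to a caller-supplied fetch function so that the sketch runs without
  a network. Error handling, retries and the flow control that real
  archives require are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, token=None, since=None, prefix="oai_dc"):
    """Build a ListRecords request URL for one page of results."""
    if token:
        # resumptionToken is an exclusive argument: only verb may accompany it.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": prefix}
        if since:
            # 'from' restricts the harvest to records changed since that date.
            params["from"] = since
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken of a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    elem = root.find(".//oai:resumptionToken", OAI_NS)
    if elem is None or not (elem.text or "").strip():
        # A missing or empty token means the list is complete.
        return None
    return elem.text.strip()


def harvest(base_url, fetch, since=None):
    """Fetch all pages; fetch is a caller-supplied callable url -> XML text."""
    pages = []
    token = None
    while True:
        page = fetch(list_records_url(base_url, token=token, since=since))
        pages.append(page)
        token = next_token(page)
        if token is None:
            return pages
```

  Every page costs one round trip, and the next request cannot be
  issued until the previous response has been parsed for its token.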

> Among the more common services might be discovery, but discovery was
> by no means the whole story.  OAIster aggregated information from
> most of the servers available in the OAI world--in some cases I
> believe they maintained data that was no longer available from its
> original source.

  You hit an important point here. Very roughly, from what I can see
  with the archives listed in DOAR, something like 60% are up,
  about 25% are down (kaputt?) and the rest shake like a cow's tail.
  So much for the optimality of OAI-PMH.

> Without an aggregation service like OAIster, those who use the
> protocol to build information services must harvest from many
> individual servers, which may be tougher and more difficult to
> maintain.

  As I said up there, it's not that big a deal, but I am happy to do
  it and share what I have since I have to collect it anyway. Your
  implicit claim that it is hard to harvest from OAI-PMH sources
  also contradicts your claim that OAI-PMH is "optimized".

> Even for those who preferred to harvest from individual
> sites, OAIster represented a backup source.

  Well, when I approached them, they were not eager to share
  what they had. I am much more eager to do it. I love resource
  sharing. Others are more sceptical, I know. 

> I believe this change should be another wake-up call for those who
> believe that having all services run by OCLC is a significant
> impediment to the healthy innovation that the library community
> needs to move forward.

  Absolutely. I could not agree more. But we need more sharing,
  and more specialization. 

  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
                                RePEc:per:1965-06-05:thomas_krichel
                                               skype: thomaskrichel