I think the only reliable "somewhere" is one that you create and manage
yourself. If you decided to add an index or a facet based on a field
that you hadn't kept, would you have to go out and re-harvest all
n million records? That could be pretty inefficient.

The long tail comes into this as well. You might have seen the OCLC
report on the statistics relating to the Google 5 (it was published in
D-Lib). They found that a very high number of records are held by only
one library. I had the same experience when working on the MELVYL
system -- about 2/3 of the records had only one holding. So if you're
trying to create a database the size of WorldCat, and you want it to be
as comprehensive as possible, you need to take input from a large
number of libraries just to begin gathering in the long tail. You don't
want to do that harvesting more often than you have to.
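
To make the language example quoted below concrete, here is a rough
Python sketch of deriving that display note from a record you kept in
full. It is only an illustration: the dict layout for a record, the
LANGUAGE_NAMES table, and the function names are all made up for this
message, and a real system would use a proper MARC parser and the full
MARC language code list. The sketch takes text-language codes from
041 $a (falling back to 008 positions 35-37) and original-language
codes from 041 $h.

```python
# Hypothetical layout for a stored record: control fields are strings,
# data fields are lists of occurrences, each a list of (subfield, value).
# This small table is a stand-in for the full MARC language code list.
LANGUAGE_NAMES = {"eng": "English", "ger": "German", "fre": "French"}


def language_codes(record):
    """Return (text_languages, original_languages) as lists of codes."""
    text, original = [], []
    for subfields in record.get("041", []):
        for code, value in subfields:
            if code == "a" and value not in text:
                text.append(value)
            elif code == "h" and value not in original:
                original.append(value)
    if not text:
        # Fall back to the fixed-field language code, 008 positions 35-37.
        fixed = record.get("008", "")[35:38].strip()
        if fixed and fixed != "|||":
            text.append(fixed)
    return text, original


def language_note(record):
    """Build a display note such as 'In English. Translated from the German.'"""
    text, original = language_codes(record)

    def name(code):
        # Unknown codes are shown as-is rather than guessed at.
        return LANGUAGE_NAMES.get(code, code)

    parts = []
    if text:
        parts.append("In " + ", ".join(name(c) for c in text) + ".")
    if original:
        parts.append(
            "Translated from the " + ", ".join(name(c) for c in original) + "."
        )
    return " ".join(parts)


# Example record: the padding stands in for the other 008 positions.
example = {
    "008": " " * 35 + "eng" + " d",
    "041": [[("a", "eng"), ("h", "ger")]],
}
print(language_note(example))  # In English. Translated from the German.
```

The point is that a function like this can be written and run years
after the harvest, but only if the 008 and 041 are still sitting in
your own store.
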
kc
Ross Singer wrote:
> Wouldn't the originals exist somewhere? I mean, if the DC is being
> created from MARC, you could just point back to the original MARC
> record. This is roughly how Talis' Platform works.
>
> -Ross.
>
> On 4/25/07, Karen Coyle <kcoyle_at_kcoyle.net> wrote:
>> Well, maybe they don't need to be full MARC, but what's the harm in
>> keeping the entire MARC record? They average about 1K, so storage isn't
>> an issue. The thing is, it's hard to determine ahead of time what you
>> can throw away. For example, I'd like to keep all of the language codes,
>> which are only in the 008 and 041, and I'd want to store the meaning of
>> those codes because I'd want to do a display that goes: "In English.
>> Translated from the German." That may be in a note in the MARC record,
>> but you can't necessarily find it among the notes. I bet that music
>> folks would like to see (or be able to search) the 048 data on the
>> number of musical instruments ("for 2 cellos, 1 piano, 1 bassoon" but in
>> a coded form). If you look through the MARC record there is some useful
>> stuff. I bet if we threw anything away we'd regret it later.
>>
>> kc
>>
>> Ross Singer wrote:
>> > Why do these have to be full MARC records for the sort of solution
>> > we're talking about? Isn't this just some sort of indirection service
>> > to get the user to the local library? Do we really need another MARC
>> > record brokerage service?
>> >
>> > What I would really rather see is this sort of uber catalog that
>> > associates useful value-add services (that would most likely be
>> > outside of a MARC record) such as summaries, links to reviews,
>> > syndicated TOCs, dust jackets, etc. That /could/ be a wiki and that
>> > way such a project wouldn't have to get bogged down with who can
>> > authoritatively edit a MARC record correctly.
>> >
>> > But, really, for the sorts of services that we're trying to capture
>> > (namely, google pagerank, wikipedia, etc.), would we need anything
>> > more granular than DC?
>> >
>> > -Ross.
>>
--
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------
Received on Wed Apr 25 2007 - 18:47:19 EDT