On 1/18/07, Jacobs, Jane W <Jane.W.Jacobs_at_queenslibrary.org> wrote:
> The problem is not that your data is in MARC, and you can't read MARC
> easily. (It can be translated, and pretty easily at that!) The problem
> is if the data contains what you don't need, lacks what you do need, or
> mixes things you need to parse separately.
The big problem with MARC is not the format (with a few tweaks, it
could be fine for the future, too), but what we put into it and how we
treat it. MARC as a format is fine. Really. Seriously. I'd attack the
*culture* of MARC, though, where we've got notation inside the fields,
our special rules and so forth. This is why there's been a bit of talk
about RDA of late which - if all is well - would fix some of the more
glaring errors.
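To make "notation inside the fields" concrete, here is a rough sketch in Python. The record is made up, and the helper name `strip_isbd` is my own invention rather than part of any library; the only thing taken from practice is that ISBD punctuation (" :" before a subtitle, " /" before the statement of responsibility) gets keyed into the 245 subfield data itself.

```python
# A hypothetical 245 (title statement) field as (subfield code, value) pairs.
# Note that the ISBD punctuation is stored as part of the data.
field_245 = [
    ("a", "An example title :"),
    ("b", "with a subtitle /"),
    ("c", "by A. Example."),
]


def strip_isbd(value: str) -> str:
    """Drop trailing ISBD separators; an illustrative helper, not a standard API."""
    # The final full stop is left alone on purpose: a program cannot tell
    # whether it is punctuation or part of an abbreviation or initial.
    return value.rstrip(" /:;=")


cleaned = {code: strip_isbd(value) for code, value in field_245}
```

Even this toy version has to guess about the trailing full stop, which is rather the point: the notation was written for human eyes, not for a parser.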
Here's the big thing that makes MARC so difficult: the 'MAR' in MARC
should have been 'MAP', machine *processable*. Catalogers and
librarians everywhere have put information into MARC - yes, using
rules - but still, humans. And humans err. And humans can't know it
all. And humans forget. And humans conform to routines. And humans
continue to abstract. And humans are the ones that created the culture
of MARC.
As a solution I propose that we stop thinking of these items of ours
as 'bibliographic'. Most of the stuff in our collections will be
digitised, and as such metadata way outside the realms of
bibliographic will be of interest. We need better ways of dealing with
arbitrary notions and entities. We need to deal with semantics in a
reasonable fashion. And - and this is a BIG one! - we need to deal
with the rest of the world. The future of the library is not going to
be internal protocols and metadata schemes. If we are to deal with the
world we need to talk to the rest of the world. Is Amazon's metadata
worth less to us because it's neither in MARC nor created with our special
bibliographic rules? If you answer 'yes', well then that's our problem
right there, and may well be the end of the library world as we know
it.
We need to turn the legacy of "bibliographic" around. Maybe we need to
completely define things anew, and I know that sounds daunting, but I
seriously think that technology available to us today will help
significantly. There's a reason why we do MODS and XOBIS and NIX and
other formats: we try - sometimes desperately! - to get out of the
legacy of "the culture of MARC", the bibliographically centred world,
as our objects more and more relate to other things we simply haven't
dealt with before. And to do that, we need to change, a helluva lot
more than what RDA proposes (a cleanup). Unfortunately, most of these
efforts are very format-centric (although both XOBIS and NIX
have some interesting data models, we need to dig into even
higher-level models for the library world at large).
I know things like the difference between a title and a subtitle are a
matter of perpetual debate (like, is the name of the author part of a title?),
but we're now faced with even bigger problems: what *is* an author?
What does it mean that a book of 10 essays has 15 authors? How should
a computer handle how to attach what author to what essay and their
roles? What *is* a book anyway as soon as it is turned into electronic
form? What does the size of it then matter? What does "pages" mean?
Will there even be indexes in the future? What's the difference
between a title and a name?
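As a rough illustration of the author-to-essay question, here is one way a computer could hold it, sketched in Python. Every class, field and name below is hypothetical (nothing here comes from MARC, RDA, XOBIS or any other existing model); the only idea is that a role is attached to the link between an agent and a thing, not to the record as a whole.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Agent:
    """A person or organisation (hypothetical entity)."""
    name: str


@dataclass(frozen=True)
class Contribution:
    """Ties one agent to one thing in one explicit role."""
    agent: Agent
    role: str  # free text here; a real model would want a controlled list


@dataclass
class Essay:
    title: str
    contributions: List[Contribution] = field(default_factory=list)


@dataclass
class Collection:
    title: str
    essays: List[Essay] = field(default_factory=list)
    contributions: List[Contribution] = field(default_factory=list)

    def names_with_role(self, role: str) -> Set[str]:
        """Names of everyone holding `role` on the collection or on any essay."""
        found = {c.agent.name for c in self.contributions if c.role == role}
        for essay in self.essays:
            found |= {c.agent.name for c in essay.contributions if c.role == role}
        return found

    def essays_by(self, name: str) -> List[str]:
        """Titles of the essays on which `name` is listed as an author."""
        return [
            e.title
            for e in self.essays
            if any(c.agent.name == name and c.role == "author"
                   for c in e.contributions)
        ]


# Made-up data: two essays, three people, three different roles.
ann, bob, cy = Agent("Ann Example"), Agent("Bob Example"), Agent("Cy Example")
book = Collection(
    title="Hypothetical Essays",
    essays=[
        Essay("First Essay", [Contribution(ann, "author"),
                              Contribution(bob, "author")]),
        Essay("Second Essay", [Contribution(bob, "author"),
                               Contribution(cy, "translator")]),
    ],
    contributions=[Contribution(cy, "editor")],
)
```

With this shape, "who wrote what" becomes a lookup (`book.essays_by("Bob Example")`) instead of something a human has to read out of a note field. It still says nothing about what an author *is*, which is the harder question.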
All these semantics need to be rethought, as I really don't think
we're done here. All these things were of course much easier when
things like "books" were physical and the concept of a "collection"
always explained the object, but even these fundamental things that
the library culture and knowledge are built on are changing.
It's a hard problem, which is why I suspect nothing much is happening
in this area. Is that the end of it, then?
Alex
--
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ -------
Received on Wed Jan 17 2007 - 18:41:00 EST