Re: Search/retrieve access is to library data what Gopher was to the web?

From: Alexander Johannesen <alexander.johannesen_at_nyob>
Date: Mon, 25 Aug 2008 14:26:49 +0200
To: NGC4LIB_at_LISTSERV.ND.EDU

Hiya,

Been out of the loop for a while, now living in the real world. Here
are my few cents' worth after playing out there for a while. And
I'm a bit angry, so read my stuff in the voice of Donald Duck to
leverage ;

On Mon, Aug 25, 2008 at 13:08, Bernhard Eversberg <ev_at_biblio.tu-bs.de> wrote:
> At an international
> conference in 1976, I asked panelists (Fred Kilgour was among them) if
> MARC couldn't be simplified. They said they had tried very hard to
> make a new, easier, simpler design - but ended up reinventing MARC.

Is that because we actually need what MARC delivers, or because it's
the same people with the same ideas trying to do this? Let's start
with looking at what the hell MARC is these days
(http://en.wikipedia.org/wiki/MARC_standards) ;

    * MARC 21: the "harmonization" of USMARC and CAN/MARC; it is
maintained by the Network Development and MARC Standards Office of the
Library of Congress.
    * AUSMARC: national MARC of Australia, published by the National
Library of Australia in 1973; USMARC adopted in 1991
    * BIBSYS-MARC: used by all Norwegian University Libraries, the
National Library, all college libraries, and a number of research
libraries.
    * NORMARC: national MARC of Norway, based on MARC21
    * danMARC2: national MARC of Denmark, based on MARC21
    * INTERMARC: MARC used by Bibliothèque nationale de France
    * UNIMARC: created by IFLA in 1977, it is the official MARC in
France, Italy, Russia, Portugal, Greece and other countries.
    * CMARC: national MARC of the Republic of China (Taiwan), based on UNIMARC
    * KORMARC: national MARC of South Korea, KS X 6006
    * MARCBN: national MARC of Poland, based on MARC21
    * IDSMARC: (inter)national MARC of Swiss German University
Libraries, Luxembourg, Liechtenstein, based on MARC21

Ouch. And WTF. And a huge problem right there. And this isn't even the
exhaustive list, just an indication that you've got bigger problems
than you at first think. I've in the past worked on a number of
unified sets, trying to make a MARC format that embraces most of
these, failing miserably every time. Why? Because librarians love to
be unique and different, no matter how much they claim to want it to
be otherwise.

Cataloging ain't that hard, but it's hard to agree on one set of
cataloging rules to fit all. There's a bucketload of common ground
that could be agreed to (and, in some sense, Dublin Core was a
fantastic but failed attempt at such), but ... but ... but ... here's
the thing; the library world is quite stale and in dire need of
getting outside help in doing these things. To solve this problem you
need to think outside the library box. You can't beat it with set
theory, unified or not. Indices won't do the trick. Agreement, fuzzy
or strict, across such a diverse group of people can only be done
through technology that enables such. So why the bleeding friggin'
bloody #%¤#¤&%#/¤ haven't you guys embraced SemWeb / Topic Maps or
similar (don't care which .. any, just any!) where this is possible,
exciting, proven and doable?

Look, I've seen all of the proposed MARC alternatives out there too,
from DC, MODS/MADS, even EADs, XOBIS, and other lesser-known ones,
BibX / Bibulus, BiblioML and so on. It's all shit. Seriously. And not
"shit" as in "bad", but as in rehashing and reinventing MARC. Why are
they so similar? Why do they pretty much do the same things? For
Pete's sake, FRBR was a fantastic step in the right direction, taken
... 15 years ago, still unproven (!!!). I've dabbled in FRBR in many
ways, most notably in Topic Maps. But all of it is dabbling,
prototypes, attempts, fiddling around with this stuff, and the prime
reason nothing will ever get to the next level is the library culture
itself.

As long as you can roughly get by on MARC, the library world will
roughly stumble along. But we're facing a breaking point, and I think a
lot of people (especially on this list) see it coming. In this day and
age when the library world struggles to find a place of future
importance (apart from being museum piece collections) we're supposed
to be the experts on information management. But what are we experts
at? MARC codes and subfield hacking. Marvelous.

>> Why do we use Z39.50, when nobody else does?  Why do we come up with ANY
>> standards that don't work well (if at all) with non-library entities?
>>
> Z39.50 came about at a time when there was no http, and the network
> infrastructure we have now had not been invented.

The shorter answer is "legacy", but the slightly longer answer is
"it's what we've got, and we've invested so much into it, we don't
*really* want to change it, besides, it works, doesn't it? If it
works, don't fix it."

> What's the language that
> everybody understands? Don't say XML, [...]

XML.

> that's only a punctuation standard
> when we need both a grammar and a spelling standard.

It's not just a punctuation standard. If you truly believe that, then
you truly have no idea what XML is. XML is a universal exchange markup
language with built-in notions of namespace-controlled vocabularies,
taxonomical structures and identity control. It has a great deal of
foundation that the library world needs. It may not be the Best Thing
Ever (TM), but as widespread technologies go, it friggin' rocks!
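
To make "namespace-controlled vocabularies" a bit more concrete, here's
a rough sketch (my own illustration, not any existing library format).
It mixes the Dublin Core elements namespace with a made-up local one in
a single record; the `loc:shelf` element and the example.org URI are
assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements namespace
LOCAL = "http://example.org/my-library/"  # hypothetical local vocabulary

# One record, two vocabularies: shared fields live in the DC namespace,
# local quirks live in their own namespace and can be ignored by others.
doc = f"""
<record xmlns:dc="{DC}" xmlns:loc="{LOCAL}">
  <dc:title>Moby Dick</dc:title>
  <dc:creator>Melville, Herman</dc:creator>
  <loc:shelf>Fiction 3B</loc:shelf>
</record>
"""

root = ET.fromstring(doc)

# ElementTree addresses namespaced elements as "{namespace-uri}localname".
title = root.find(f"{{{DC}}}title").text
shelf = root.find(f"{{{LOCAL}}}shelf").text

# A consumer that only understands Dublin Core keeps just those elements.
shared = [el.tag.split("}")[1] for el in root if el.tag.startswith(f"{{{DC}}}")]

print(title, "|", shelf, "|", shared)
```

The point is only that the namespace tells a consumer which parts it
can rely on and which parts are somebody's local business.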

This brings me to the next point that shits me so; MARCXML. Now, I
can understand why it went the way it did, creating that basic
platform of unstructured data as a first step. But why, oh why
(waily, waily me!) didn't they simply use the benefits of XML itself
as sprinkling on top so that we could extend it to become something
actually useful? The mind not only boggles, but gets really angry.

XML with some agreed-upon vocabularies and possibly a sprinkling of
identity control and mixed content would get us out of this MARC
pickle sooner than *anything* else. I'm baffled as to why no one seems
to argue this way or see it. It's obvious, and right there in front of
you. If your ILS could speak XML in any serious way (and no, a text
processor spitting out XML does not count) mixed content models would
be easy. And it's certainly not hard to implement, as XML libraries
(and if you're really smart, XSLT processors) exist for any platform
out there. Want to push your vendor into doing at least *one* right
move? Ask them to have XML import and export of mixed content models,
and divvy up the horrid MARCXML into namespaces, and then we can go
about saving this mess, one little piece at a time. Start by
creating schemas (not XSD if you can help it, of course :) of those
parts of MARC that are important (physical size of a book is not as
important as what the message of the book is), and get control of that
legacy, for Pete's sake.
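
As a minimal sketch of what "divvy up MARCXML into namespaces" could
look like, the snippet below maps a few MARC 21 subfields (100 $a,
245 $a, 300 $c) onto namespaced elements. The mapping table, the
`split_marcxml` helper and the example.org "physical" namespace are
assumptions made up for illustration; this is not an agreed standard
or anybody's actual crosswalk.

```python
import xml.etree.ElementTree as ET

MARC = "http://www.loc.gov/MARC21/slim"   # MARCXML namespace
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements namespace
PHYS = "http://example.org/physical/"     # hypothetical namespace

# Hypothetical mapping: (MARC tag, subfield code) -> (namespace, element)
MAPPING = {
    ("245", "a"): (DC, "title"),
    ("100", "a"): (DC, "creator"),
    ("300", "c"): (PHYS, "dimensions"),
}

def split_marcxml(marcxml: str) -> ET.Element:
    """Copy mapped MARCXML subfields into a new namespaced record."""
    record = ET.fromstring(marcxml)
    out = ET.Element("record")
    for field in record.iter(f"{{{MARC}}}datafield"):
        for sub in field.iter(f"{{{MARC}}}subfield"):
            target = MAPPING.get((field.get("tag"), sub.get("code")))
            if target is not None:  # unmapped subfields are skipped
                ns, name = target
                ET.SubElement(out, f"{{{ns}}}{name}").text = (sub.text or "").strip()
    return out

SAMPLE = f"""<record xmlns="{MARC}">
  <datafield tag="100" ind1="1" ind2=" "><subfield code="a">Melville, Herman</subfield></datafield>
  <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Moby Dick</subfield></datafield>
  <datafield tag="300" ind1=" " ind2=" "><subfield code="c">21 cm</subfield></datafield>
</record>"""

result = split_marcxml(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```

Once the important bits sit in one namespace and the physical
description in another, a consumer can pick the parts it cares about
and ignore the rest.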

>> Why do we pay exorbitant prices for MARC record data, when it should
>> come free from the publisher or distributor?
>
> Do you have an idea how many publishers/distributors are involved?

That shouldn't have anything to do with it. This is purely a legacy of
the library world, and has got nothing to do with what we should be
doing. In fact I'll go further and say that the pricing model and the
sharing of MARC meta data model is the main reason the library most
probably will crumble to the ground. Of course there's a long history
of the library world being screwed over and given pittance, and that
more or less justifies why we're in this situation. But it doesn't
justify keeping it this way.

I think we all know there's a revolution needed not just in terms of
technical stuff, but in the way we think and act as librarians.
Knowledge will *always* be important. Knowledge about MARC, not so
much. Knowledge about taxonomical structures and the details of
thesauri hacking will be. Knowledge about FRBR manifestations, not as
much. One thing that must happen to enable this is that all libraries
across the world, at least those who care, must share meta data --
freely! No strings attached. Openly. And with a happy smile on your
faces. You must embrace the idea that you don't own anything; we all
do, we all own our shared human legacy. The library world needs to
worry more about that than any local business model that some MBA in
some council has jammed down your throats. We need riled up
librarians, passionate about what they do and *why* they're doing it.
Passion to change things, and make it right again.

And let me state for the record that I've seen this passion in every
single bloody librarian and cataloger I've ever had the pleasure of
talking with! Every single one!

> I think many libraries stick with OCLC because there is no match for
> its comprehensiveness. There's just no real market (any more) for the
> kind of services they provide.

There *shouldn't* even be a market. The library ideal is not supported
by the current library business models. Just the fact that we call it
*business* models works against us here. We're not in any business,
except, perhaps, getting knowledge out to all those who seek it.

>> If we are going to pay,
>> why isn't it something like 5 cents (or less) a record?  The best
>> choice, of course, is to simply get rid of MARC, but why have we
>> tolerated this treatment from vendors for so long?
>>
> Why indeed?

Very good questions. I suspect that because we have to pay for
records, there's not going to be a communal feel to the quality of
them, either. Besides, it's subscription, not ownership. If catalogers
/ librarians don't even feel ownership towards their most precious
legacy, then what are they doing? Really?

>> Why do we tolerate out-of-date, buggy ILS/OPAC systems, when they are
>> only (ultimately) inventory and customer management systems?  Worse still,
>> why do we pay more than a couple of hundred dollars for ILS/OPAC
>> systems?  They aren't that complex, from a programmatic/database
>> management standpoint.
>
> Here I disagree. Although I am myself a provider/vendor of a partially
> open source product that sells for just $400 and covers most functions
> a library may need. But this product's development has been subsidized
> and is not being marketed in a proper way.

The complexity in ILSes is because they need to cater to all the
special needs of every Dick, Jane and Harry in the library world.
There's enough bickering going on over the right way to format a date
string in a MARC field to fill a couple of books (and that's just from
the vocal ones; there are hordes of silent librarians / catalogers out
there that just mumble in the background, trying their best at getting
the meta data in there), not to mention how a serials issue should be
handled, barcodes generated, or call-slips generated (or not!). This
*could* be solved easily if MARC (or, perhaps more specifically, the
systems that govern its content) had any sort of typed control
attached with a mixed content model (so that ILSes could be
modularized not only in software, but in the meta data they need and
handle). But alas, there's nothing, except AACR2 and big daddy RDA,
which come in a megaton of prose! Friggin' PROSE!
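
Here's a tiny sketch of what I mean by "typed control": the rule for a
field lives in software rather than in prose. The field names and the
`parse_field` helper are hypothetical, picked only to illustrate the
idea; nothing here comes from MARC, AACR2 or RDA.

```python
from datetime import date

# Hypothetical typed field definitions: field name -> parser function.
# Each parser raises ValueError when the raw string breaks the rule.
FIELD_TYPES = {
    "publication_date": date.fromisoformat,  # expects YYYY-MM-DD
    "page_count": int,
}

def parse_field(name: str, raw: str):
    """Parse a raw field value with the type registered for that field."""
    parser = FIELD_TYPES[name]
    return parser(raw.strip())

good = parse_field("publication_date", " 2008-08-25 ")

try:
    parse_field("publication_date", "25th of August, 2008")
    rejected = False
except ValueError:
    rejected = True  # the rule is enforced by code, not by a style manual

print(good, rejected)
```

The date-format argument then happens once, in the type definition,
instead of in every record.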

When all cataloging that is being done is done through computers, why
the bleedin' friggin' "%#¤"%/#%& don't you also create rules that
carry a software punch? RDA does *not* cut it; it misses the mark by
about a century.

>> All over the U.S. we see libraries closing, budgets being slashed, $0
>> budgets for new materials, open hours being significantly reduced, staff
>> layoffs, etc.  This is an era where we have to actually prove our worth
>> and value to our communities, provide services that our patrons actually
>> need, and live up to our own hype.  If we can't do that, then we are
>> little more than free bookstores with a few extra services tacked on.
>>
> For many purposes of many readers, today's library service is indeed
> overkill where it isn't obsolete. Not by far, however, for all of them.

The sad part is that I think the library world can offer something
that others never ever can; culture, empathy, human interaction,
musky reality, knowledge context. All that's been suppressed over the
years in libraries everywhere, such as wisdom about books and opinions
on them, is now the very thing the world craves, needs, and with the
closure of libraries everywhere, won't get.

I won't blame all of this demise on MARC alone of course; I'll throw
in the current RDA stuff, the failings of FRBR and the lack of balls
everywhere to shake people in charge out of their false sense of
security through sleeping with vendors. Librarians everywhere need to
wake up and realise that they are needed, wanted and much loved
throughout the world and that there's a small revolution that needs to
happen to make libraries future-proofed. (Well, a revolution that is
tiny by world standards, but obviously huuuuuuge by library-world
standards.) The library legacy must be broken up, be shareable and
controllable, because, frankly, the old model of human librarians as
keepers and guardians of every little scrap of meta data just ain't
possible with the onslaught of information heading our way ...

Live long and prosper!

Alex
-- 
---------------------------------------------------------------------------
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ --------