On Tue, Apr 20, 2010 at 8:39 AM, Alexander Johannesen
<alexander.johannesen_at_gmail.com> wrote:
> It does surprise me somewhat that all you smart folks don't get
> together and create a framework for washing and cleaning up MARC
> records (making it convertible to whatever else you want or need).
I think there are multiple reasons for this, the big one being that it
takes the time and effort of someone (or several someones) who is both
technologically adept and comprehensively familiar with the cataloging
rules (and all of their variants).
Even bigger, honestly, is the fact that "cleansed" data might only be
fractionally better (or a lot better; it's hard to know until it's
done) and that our current systems wouldn't work any better even with
more semantically rich data.
And now, if you can identify the resources in your data, it's a whole
lot easier to fill in the gray areas from sources like DBpedia or
Freebase than to worry about regexes.
I hope the gravity of that last sentence resonates with somebody.
-Ross.
Received on Tue Apr 20 2010 - 09:03:28 EDT