Re: Niggly little bits

From: Jonathan Rochkind <rochkind_at_nyob> Date: Tue, 1 May 2007 10:20:01 -0400 To: NGC4LIB_at_listserv.nd.edu

Mark Sandford wrote:
> Much has been said about the quality of catalog records in this
> discussion, and the general idea seems to be that other people are
> doing it better than we are.
I'm not sure this is the general idea. I think (some but certainly not
all) other people are controlling their data in a way more amenable to
machine action, to machines using this data in flexible ways.  I
certainly didn't mean to say anything more than this.

I don't know if anyone else controls the volume of data that our
community does, or controls it as well. The problem is that our data
isn't as useful as it could be because of the _way_ we control it.

I think people in our community have a tendency to get defensive and
hear "You're saying just replace us with Google or Amazon" when I say
things like this. I am not. Personally, I certainly don't have much luck
using Amazon for topical searches ("find me a book on X"), and don't
generally use it for this. Was anyone suggesting this? People are
certainly allowed different experiences; a real conclusion can come only
from usability testing. But I agree with you.

> FRBR friendly, I'll agree.  But FRBR seems much more focused on the
> known item search.
What makes you think "FRBR seems much more focused on the known item
search"? To me, in order to start controlling our data in a way that can
be best used by machines, we _need_ to start thinking in terms of a
formal and explicit model for our data, instead of using the informal
and implicit concepts we have been. The FRBR Model is an attempt to
formalize these implicit shared (or somewhat shared) understandings into
an explicit model, and while it's certainly not perfect, I don't know of
a better candidate to use as such a model, which we desperately need.
Nothing about the FRBR model seems prejudiced toward uses of data for
known item searches to me, though.

> What system will change that?  Without a concerted effort to put in
> the niggly little bits--without taking the time and effort to encode
> the information we want to use--no catalog, no metadata scheme, and no
> set of cataloging rules will improve what we have. You can't search or
> sort data the computer doesn't have.
Very true. But you also can't search or sort data that has not been
stored in a way that can be searched or sorted effectively. You can't
let the user filter on just DVDs if there's no good way to extract
whether an item is a DVD from the record. You can't show the user only
journal holdings that contain volume 4 issue 12 if there's no good way
for a machine to extract this from the record. You can't show the user
all items that are adaptations of Hamlet (vs. all items that are
anthologies including Hamlet!--a very different thing!) if the record
does not contain the information in a way that can be extracted. Etc. etc.
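
To make the three examples above concrete, here is a rough sketch of what
"extractable" could look like. Everything in it is an assumption made up
for illustration: the Record class, its field names (carrier, holdings,
adaptation_of, contains), and the sample titles are hypothetical, and are
not MARC, FRBR, or any real system's schema. The only point is that once
format, holdings, and work-to-work relationships are stored as explicit
values, each filter becomes a one-line test.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical record with explicitly coded, machine-readable fields."""
    title: str
    carrier: str = "unknown"  # assumed coded value, e.g. "dvd" or "print"
    holdings: list = field(default_factory=list)       # (volume, issue) pairs held
    adaptation_of: list = field(default_factory=list)  # works this item adapts
    contains: list = field(default_factory=list)       # works this item includes


def only_carrier(records, carrier):
    """Keep only records whose coded carrier matches, e.g. just DVDs."""
    return [r for r in records if r.carrier == carrier]


def holding_issue(records, volume, issue):
    """Keep only records whose holdings include the given volume and issue."""
    return [r for r in records if (volume, issue) in r.holdings]


def adaptations_of(records, work):
    """Items that are adaptations of a work."""
    return [r for r in records if work in r.adaptation_of]


def anthologies_containing(records, work):
    """Items that include a work, which is a different relationship."""
    return [r for r in records if work in r.contains]


# Invented sample data, for illustration only.
records = [
    Record("Hamlet (film)", carrier="dvd", adaptation_of=["Hamlet"]),
    Record("Collected Tragedies", carrier="print", contains=["Hamlet", "Macbeth"]),
    Record("Journal of Examples", carrier="print", holdings=[(4, 11), (4, 12)]),
]

dvd_titles = [r.title for r in only_carrier(records, "dvd")]
issue_titles = [r.title for r in holding_issue(records, 4, 12)]
adaptation_titles = [r.title for r in adaptations_of(records, "Hamlet")]
anthology_titles = [r.title for r in anthologies_containing(records, "Hamlet")]
```

If the same facts lived only in a free-text note ("DVD release; v.4
no.11-12; based on the play"), none of these filters could be written
reliably, even though a human reader would find the record complete.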

I do not mean to imply that we no longer need "a concerted effort to put
in" good data, when I say that we ALSO need a concerted effort to
control our data in a way that software can use to provide the services
we need to provide.

Whenever I see discussions on these sorts of topics on library
electronic forums, it's always made into a weird either/or
two-camps-and-only-two-camps dichotomy. It is not.

Jonathan

> --
> Mark Sandford
> Special Formats Cataloger
> William Paterson University
> (973)270-2437
> sandfordm1_at_wpunj.edu
>

--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu