We have veered way off the topic... (was Are MARC subfields really needed?)

From: Ted Koppel <tpk_at_nyob> Date: Mon, 7 Jun 2010 07:33:06 -0700 To: NGC4LIB_at_LISTSERV.ND.EDU

All,

I think we have lost sight of the big picture here, and have gotten
caught in the quicksand of fields and subfields.

This thread began with observations about data structuring.  I think that's
where the discussion really is.  One of the original questions was
(simplified and paraphrased) "why do we need to structure bibliographic
data the way we have been doing?"  To me, the answer is - "we don't
need to structure it the way we have" BUT **we need to structure it
somehow**.  The benefits of structured data are manifold - elements can
be manipulated, sorted, externally referred to, etc. - if they can be
identified.  Unstructured data is ambiguous and difficult to work with;
structured data is more logical and flexible.

The specific structure fundamentally doesn't matter *to the data*.
It's the application that makes the difference.  We use MARC and MARC
manipulation because we have the tools that have been developed over the
last 30+ years.  MARC is understood, it mostly works, and it's pretty
content rich.  But - we could just as easily use ASN.1/BER (or
ASN.1/DER), XML, SGML ... or any of a hundred other ways to structure the
data.
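
As a minimal sketch of this point: the same bibliographic elements can be
carried in two different serializations and still come out as the same
identified elements.  The element names ("title", "name") and the sample
record below are assumptions made up for illustration, not MARC or any
real schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical element names; not taken from MARC or any real schema.
AS_JSON = '{"title": "War and peace", "names": ["Tolstoy, Leo", "Garnett, Constance"]}'
AS_XML = (
    "<record>"
    "<title>War and peace</title>"
    "<name>Tolstoy, Leo</name>"
    "<name>Garnett, Constance</name>"
    "</record>"
)


def from_json(text):
    """Read the JSON serialization into a plain dict of identified elements."""
    data = json.loads(text)
    return {"title": data["title"], "names": list(data["names"])}


def from_xml(text):
    """Read the XML serialization into the same plain dict."""
    root = ET.fromstring(text)
    return {
        "title": root.findtext("title"),
        "names": [el.text for el in root.findall("name")],
    }


# Both serializations yield the same identified elements.
print(from_json(AS_JSON) == from_xml(AS_XML))
```

Once the elements are identified, an application can work with them the
same way no matter which carrier format was used.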

Traditional ILS OPACs (and for that matter, NG OPACs) are simply and
*only* for displaying the structured data.  As long as you know where
to find that data in a structure (whatever structure - MARC or
something else), then combining it, presenting it, and sorting it are the
responsibility of the application, not the data.  So if someone wants
to have a bibliographic record display to a user with two main entries
(note that I am purposely *not* saying two 1xx fields), that's the OPAC
application's responsibility, not the data's.  You want to browse by
title?  Fine - but that's an application and presentation issue, not a
data issue.  As long as the data is structured in some sensible way, it
can be done.
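
A rough sketch of that division of labor, under stated assumptions: the
Record class, display_heading, and title_browse are hypothetical names
invented for this example, and the bracketed heading format is arbitrary.
The record only holds identified elements; how many names to show and how
to file titles are decided in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical storage model: identified elements, no display rules."""
    title: str
    names: list = field(default_factory=list)


def display_heading(record, max_entries=2):
    """Presentation choice: show up to max_entries names with the title."""
    heads = " ; ".join(record.names[:max_entries])
    return f"{record.title} [{heads}]" if heads else record.title


def title_browse(records):
    """Presentation choice: file records by case-folded title."""
    return sorted(records, key=lambda r: r.title.casefold())


records = [
    Record("War and peace", ["Tolstoy, Leo", "Garnett, Constance"]),
    Record("History of the later Roman Empire", ["Bury, J. B."]),
]

for r in title_browse(records):
    print(display_heading(r))
```

Changing max_entries or the sort key changes what the user sees without
touching the stored records.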

If there's a real issue here, it is that OPAC developers have cleaved
too closely to the MARC standard (probably because library RFPs asked
for it) and they haven't aggressively decoupled the storage model from
the display model.   THAT's the discussion we ought to be having.

Ted

tpk_at_auto-graphics.com

-----Original Message-----
From: Next generation catalogs for libraries
[mailto:NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Michele Newberry
Sent: Monday, June 07, 2010 9:59 AM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] Title browse within the new systems (was Are MARC
subfields really needed?)

Jim,

1.  I'm not sure what you mean by "various titles proper filing
together" as though that's a bad thing.  I thought the point of a title
browse function along the lines of traditional OPACs was the objective?

BTW, in the "War and peace" search, there is an example of the problem
that arises if the subfield c isn't encoded properly.  There, sorted
alphabetically, is a clear case where the "remainder of title page" was
filed as part of the title:

War and peace atlas

War and peace / by Leo Tolstoy ; translated by Constance Garnett.

War and peace : chapter notes and criticism, including Leo Tolstoi's
Some words about War and peace /

Looking at the full MARC display, we see:

War and peace / |b by Leo Tolstoy ; translated by Constance Garnett.

If the "b" were a "c", then this title would be in its proper place.
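
The filing effect can be sketched as follows.  This is only an
illustration under assumptions: parse_245 and browse_key are hypothetical
helpers, the browse key is assumed to be built from subfields a and b
only, and the subfield coding shown for the two neighbouring titles is
guessed (the third title is also shortened).

```python
import re


def parse_245(display):
    """Split a 245 display string on '|x' delimiters.

    Text before the first delimiter is treated as subfield a.
    """
    parts = re.split(r"\|(\w)\s*", display)
    subfields = [("a", parts[0].strip())]
    subfields += [
        (parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)
    ]
    return subfields


def browse_key(subfields):
    """Assumed filing rule: use subfields a and b, ignore c and punctuation."""
    text = " ".join(value for code, value in subfields if code in ("a", "b"))
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.casefold().split())


def filing_order(displays):
    """Sort 245 display strings by their browse key."""
    return sorted(displays, key=lambda d: browse_key(parse_245(d)))


miscoded = "War and peace / |b by Leo Tolstoy ; translated by Constance Garnett."
corrected = "War and peace / |c by Leo Tolstoy ; translated by Constance Garnett."
# Coding of these two neighbours is assumed; the second is shortened.
neighbours = [
    "War and peace atlas",
    "War and peace : |b chapter notes and criticism /",
]

print(filing_order(neighbours + [miscoded]))
print(filing_order(neighbours + [corrected]))
```

With the "b" coding, the statement of responsibility becomes part of the
key and the record files between the other two; with the "c" coding, the
key is just the title proper and the record files first.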

2. Our traditional browse function exists because librarians wanted it.
Keyword wasn't sufficient no matter how much we tuned the relevance
ranking to make sure that a title keyword search brought the logical
matches to the top.  They say they were guided by user behavior and
requests for the function.  I'm sure there is some truth to this.  There
are users of our tools who were once card catalog users and they bring
certain expectations to our discovery tools as well.

3. Regarding your historical research in catalogs (presumably pre-AACR2;
pre-MARC), perhaps it's because early bibliographers/catalogers valued
the authors of the works more than the titles?  In other words, it was
the creators of the works that had the most value.  On top of this, with
many fewer users than we have today, perhaps the phenomenon of "it's that
big book with 'peace' in the title, I can't remember the author" wasn't
as common?

4. Common titles such as "Bury's Later Roman History" are excellent
candidates for an alternative title field - one that clearly shows this
to be a made-up title rather than something that existed on the
publication itself.  Of course, putting that title in the record
requires thought and time on the part of the cataloger.  Maybe we don't
have the resources to do that either?  Or is this where tagging comes
into the picture?  Let the public's contributions make up for what we
don't want to pay anyone to do anymore?  A keyword search can find the
work without knowing the proper title too.  Which is why keyword was
such an incredible addition to our OPACs a couple of decades ago. At the
time, of course, we saw keyword/boolean as an adjunct to the
traditional, pre-coordinated indexes, not as a replacement.

5. In all honesty, my reaction to the lack of detailed coding in other
sources of metadata is that the creators of those forms chose not to
understand the value of AACR2 and MARC when those forms of metadata were
first being proposed.  I heard a presentation on Dublin Core at OCLC
Users Council when it was first being developed.  I stood up and said
"it looks like cataloging to me" and was told "Shush, don't tell the
geeks; they think they invented it."  IMHO, the coding in MARC isn't
that hard to do or that hard to learn, especially in the main
description fields (245-3XX).  Maybe the geeks just weren't smart enough
to figure out tools to help their users do it?

I'm not defending every aspect of either AACR2 or MARC - certainly it
behooves us to look at the ROI of what we're doing and determine what we
can afford.  I will remind you that MARC itself is just the method of
encoding the rules and is not a cataloging standard itself.  The rules
existed for a 2nd reason - to clearly identify unique bibliographic
entities for posterity - because it was deemed important to know those
differences - which is why we don't put any old made-up title in the 245
and why we transcribe the remainder of the title page and encode verso
content in prescribed ways.  So that, a century from now, a bibliophile
can tell whether the work represented by the cataloging data was the
actual work desired or some other manifestation of it.  If this
objective is no longer needed, then a lot of specific coding can be
fuzzified.  There are a lot more complex parts of MARC that are
candidates for this than the 245 subtitle IMHO.

- Michele

Weinheimer Jim wrote:

> Michele Newberry wrote:
> <snip>
> Jim,
>   You might want to look at our Endeca-based library catalog to see an
> example of a title browse within an interface that normally doesn't
> support this type of browsing.
> http://catalog.fcla.edu
> Click on the "Search begins with" radio button and after the screen
> refresh, type your title.  You can also select Author and Series.
> Lack of the Subject option is an indicator that we just couldn't quite
> work out all the issues of those pre-coordinated index entries within
> this technology.
>
> In this instance, I think it aids the user not to have the content
> from the subfield c in the display so that subfield has some value to
> me.  We find some value in the subfield b for relevance ranking
> purposes when we're trying to bring the probably most likely results
> to the fore.  We call it the "on the road" test.  This uses the words
> being searched as a percentage of the words in the title.
> Differentiating the subtitle is helpful here.
> </snip>
>
> Thanks for sharing this. Certainly, it is a much better display, but
> if I search for War and Peace, I still find various titles proper
> filing together. Still, my experience with people is that they almost
> never know the exact titles of an item they want. Citations are very
> often incorrect, and the need for browsing titles proper is far more
> important to librarians and catalogers than to the public. [As an
> historical aside, from my researches of early catalogs, some *never*
> made an entry for title, sometimes not even for the Bible. If the
> cataloger could find no author to enter the record under, they would
> place these records into an "Anonymous, Pseudonymous Works" or
> something similar.]
>
> My suspicion is that in the public's mind, much more common is what
> used to be termed the "catchword title", e.g. they would think "Bury's
> Later Roman History", and not "History of the later Roman Empire" or
> "Professor Thompson's book on Alfred Hitchcock" instead of "The moment
> of Psycho : how Alfred Hitchcock taught America to love murder".
>
> Just to make it clear, I am *not* saying we should stop coding the
> subtitle separately, primarily because it is codified in ISBD. But its
> utility does have to be reanalyzed seriously in our new environment,
> along with *every other part* we do. There are also consequences to
> consider: if we want to accept metadata from other providers that do not
> code the subtitles separately, do we continue to edit the subtitles
> locally? Is that a wise use of our resources?
>
> Yet, if we just accept these other records without recoding,
> consistency falls apart and what does that mean for quality? If we do
> not consider the implications and consequences of all of this, then when
> higher authorities ask what someone has done in the last week, they
> certainly will wonder when they hear: "I've added 245$b to 400 records!"
> and when this higher authority asks why this is so important, we won't
> be able to point to any adverse consequences, so there will be no other
> answer than: "it's the correct way to do it."
>
> Is this the best use of the staff?
>
> James Weinheimer  j.weinheimer_at_aur.edu
> Director of Library and Information Services
> The American University of Rome
> via Pietro Roselli, 4
> 00153 Rome, Italy
> voice- 011 39 06 58330919 ext. 258
> fax-011 39 06 58330992
