Jonathan,
Thanks for your summary. More about that below.
Karen wrote:
"My guess is that we are going to improve our catalogs incrementally
..."
I appreciate that. That's often the safest way to do things. It enables
you to see the costs and benefits at each step. If you try to go too
fast, there's a good chance you'll regret something later.
"Beyond that, we also need to embrace incoming data and resources that
differ from library standards so that we can be seen as a source of all
information, not just "library" information."
A big part of Mann's position is to emphasize that libraries and the Web
really do have different values. No one denies that we need to catalog
some things on the web. After all, journals are going electronic, and
there's probably no way, and no reason, to stop that. At the very least,
it will solve huge problems of space. The web is one medium of
publication. But libraries, especially in certain fields, are made up of
more than journals, and libraries convey more than information. They
convey knowledge, a higher, more integrated level of awareness. I don't
want to annoy people with a lot more philosophical postings, but at the
bottom I'm going to copy and paste something I posted several weeks ago
that states my own personal take on the difference.
On his blog (http://bibwild.wordpress.com/2007/05/25/broken-huh/),
Jonathan writes:
"There are very basic questions of high interest to our users that our
data set is unable to answer, even though we are spending time recording
information that ought to be available to answer these questions. One
very good example - and it's just one example - is Roy Tennant's analysis of
the inability to say whether full content is available online even
though we are already spending time recording URL information."
Now, I'm not going to say we definitely shouldn't make that crystal
clear on our OPACs, if there's some way to do so. (I'm not an electronic
or media cataloger, so that's kind of out of my department.) But I do
have to ask how much of a burden that uncertainty really is on users.
This seems to assume that as soon as you enter your search query and get
a result set, full-text content should be immediately discernible. I
realize it is on many electronic databases. But is that really a major
problem for researchers? To have to click a few more times?
In his publications, Thomas Mann emphasizes that real research is a
complex, difficult process that often has to be approached from various
angles. It takes time. And you often need training from reference
librarians on what to look for if you're in an unfamiliar area.
Not having read the Autocat postings Tennant refers to, I don't really
know why catalog records do not indicate full text in many cases. But
I'm guessing that it simply wasn't regarded as important by the
designers of the records, in comparison with other things.
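For what it's worth, MARC 21 does give the 856 field a second indicator
that is supposed to say how the URL relates to the item described: 0
means the resource itself, 1 a version of it, 2 a related resource. So
part of the problem Tennant describes may be less that the data can't
answer the question than that the indicator is coded inconsistently and
most OPACs don't display it. Here is a minimal sketch (Python, using
the pymarc library; the file name is made up) of how one might survey a
batch of records:

    from pymarc import MARCReader

    # MARC 21 meanings of the 856 second indicator:
    # 0 = the resource itself, 1 = a version of the resource,
    # 2 = a related resource, blank = no information provided.
    RELATIONSHIP = {
        "0": "resource itself (full content)",
        "1": "version of the resource",
        "2": "related resource",
    }

    # "records.mrc" is a hypothetical file of MARC records.
    with open("records.mrc", "rb") as fh:
        for record in MARCReader(fh):
            for field in record.get_fields("856"):
                label = RELATIONSHIP.get(field.indicators[1], "no information")
                for url in field.get_subfields("u"):
                    print(url, "->", label)

If most 856s turned out to carry a blank second indicator, that would
support the complaint; if they are coded, the fix is a display problem,
not a data problem.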
"The metadata system/environment we have now was very intelligently
optimized for the social, economic, and technical context of the mid
20th century."
I'd opine that it's, at the very least, optimized for the last decade of
the 20th century. Personally, I think it's optimized for this decade,
but there's no justification for treating it as a relic of
the mid-20th century. A lot of advantages have come with online
catalogs: information is accessible in many more ways today, even if the
content that was on the cards has remained largely the same.
Another point I imagine someone might bring up would be
post-coordination as a "better tool" than pre-coordination, since it's
more "web friendly." The best thing I know on that topic is this piece
by Mann from the Bibliographic Control for the New Millennium
conference:
http://www.loc.gov/catdir/bibcontrol/mann_paper.html
It's long, but worth reading.
Jonathan, I'll look more at your blog and the responses to it. Thanks to
Bernhard and Alexander for their postings on this thread, too.
--Open-mindedly yours, Ted Gemberling
-----
Libraries and the Web (with personal references removed)
Here's a stab at how we might distinguish the purposes of libraries and
the Web. I think libraries, as public institutions, are in the business
of preserving information that the public (or maybe better, the "body
politic") has decided is important. The things which are necessary for
education, research, public safety, and other concerns. That isn't
really contradicted by public libraries' fiction sections, because they
just show that the "body politic" has decided it's important to provide
entertainment, too. Nor is it contradicted by some libraries being
privately owned, because even if they're private--unless they're just
"libraries" in people's homes--they have to reflect "public" concerns to
some extent. Otherwise no one will use them.
In contrast, the Web is centered on the interests of individuals. It is
often ... "loose data." It is the realm of freedom and personal
preference, and to some degree of chaos. Great sites like IMDb or Google exist
because people want to look for things outside what is provided by the
public institution of libraries. If you're a film buff like me, you
won't be satisfied by what libraries can give you. And we wouldn't want
to make libraries tell us everything about movies. At least not most
libraries.
This isn't to say you can't publish things, even "serious" things like
electronic journals, on the Web. Though the "serious" ones are more
likely to come with a price. Maybe I should say the Web is a realm that
contains both "raw" and "controlled" data, and librarians select
strictly from the things they've decided are important.
On the Web, it's questionable that one really has an inalienable right
to anything. I'm sympathetic to "Net Neutrality," but I wonder if we
might have to realize that as an entity that exists for individuals'
whims and interests, the Internet may not be able to provide equal
access to everybody. That may be another important purpose of libraries:
to provide a place where individuals who can't afford fast Internet access
at home can get it. But capitalism may hold sway on the Web, as in most
forms of publishing.
Here's an example of the value of "loose data." I catalog 19th century
books, and many of them have signatures that are pretty illegible.
Sometimes I can only guess at how to read people's handwriting. Google
is a terrific source for deciphering the signatures at times. LC's Name
Authority File can help somewhat, but it is much further than Google from
containing every personal name that has ever existed. On Google, I can
try different possible readings of the names and see which ones have
matches. After I do that, I may go to the NAF to see if there's a
corresponding heading.
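To make that concrete, here is a toy sketch (Python; the bracket
notation and the sample name are my own, purely for illustration) of
the "try different readings" step, expanding the letters I can't make
out into candidate spellings to search one by one:

    import re
    from itertools import product

    def candidate_readings(pattern):
        # Letters in [brackets] mark strokes that could be read more
        # than one way; expand every combination into a full spelling.
        pieces = re.split(r"\[([^\]]+)\]", pattern)
        fixed = pieces[0::2]                       # the legible stretches
        choices = [list(p) for p in pieces[1::2]]  # the ambiguous letters
        names = []
        for combo in product(*choices):
            name = fixed[0]
            for letter, rest in zip(combo, fixed[1:]):
                name += letter + rest
            names.append(name)
        return names

    # A made-up signature where two letters are unclear:
    print(candidate_readings("M[ae]y[eo]r"))
    # ['Mayer', 'Mayor', 'Meyer', 'Meyor']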
As a library cataloger, my job is to translate that "loose data" into
something that isn't "loose." Of course established headings exemplify
"non-looseness." When something goes from the realm of the private to
the public, looseness has to stop for the most part. Transcriptional
fields like the 246 are looser, but even they are governed by some
strict rules.
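To illustrate with a made-up record fragment:

    100 1_ $a Twain, Mark, $d 1835-1910
    245 14 $a The adventures of Tom Sawyer
    246 30 $a Tom Sawyer

The 100 heading has to match the authority file exactly; the 245 and
246 are transcribed from or derived from the piece in hand, but even
there the indicators and subfield coding are fixed by rule.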