Re: Relevancy-ranking LCSH?

From: Diane I. Hillmann <dih1_at_nyob>
Date: Mon, 5 Feb 2007 17:57:34 -0500
To: NGC4LIB_at_listserv.nd.edu

Karen:

As you (and most people) know, I rarely disagree with you, but I have
a somewhat different outlook on LCC.

>Tim, in part I think at one point you confuse LCSH and LC
>Classification. LC Classification shelves things in a single place; LCSH
>allows multiple subject headings to be added to a record.

In the physical book world where classification was used primarily
for shelving location, the one-class-number rule made sense, but in
the digital world, we've been released from those artificial limits
and can once again look at LCC as a form of subject access, different
from but not inherently less important than LCSH.

>Apart from that, if you are looking at ranking, a few bits of info:
>1. When creating a MARC record, the first LC subject heading on the
>record is supposed to be one-to-one with the single LC classification
>number assigned to the item. I don't know if catalogers still do that,
>but it was true at one time. That would presumably make the first LCSH
>field be "more important" than the others.

This was certainly the common practice in the past, but it's
important to remember that some systems imposed other kinds of order
on multiple subjects, so it's not necessarily always the case.  And
of course, where classification doesn't exist or is used in multiples
(and not for shelving), relying on position as an indicator of
importance will be less reliable.
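
To make the caveat concrete, here is a minimal sketch of position-based weighting with a fallback for records whose heading order can't be trusted. The function name, the `first_is_primary` flag, and the decay value are all assumptions made for illustration; nothing here comes from an existing system.

```python
from typing import Dict, List


def position_weights(headings: List[str],
                     first_is_primary: bool = True,
                     decay: float = 0.5) -> Dict[str, float]:
    """Weight the subject headings on one record (hypothetical helper).

    If first_is_primary is True, weights fall off geometrically with
    position (1, decay, decay**2, ...), then are normalized to sum to 1.
    Otherwise every heading gets an equal share, because the order of
    headings is not assumed to mean anything.
    """
    if not headings:
        return {}
    if first_is_primary:
        raw = [decay ** i for i in range(len(headings))]
    else:
        raw = [1.0] * len(headings)
    total = sum(raw)
    weights: Dict[str, float] = {}
    for heading, w in zip(headings, raw):
        # A heading repeated on the record accumulates its shares.
        weights[heading] = weights.get(heading, 0.0) + w / total
    return weights
```

Whether a given record deserves `first_is_primary=True` would have to be decided from what is known about the source system, which is exactly the part that is hard to know.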

>2. The LC headings are both broad brush strokes and overly subdivided
>statements. Broad because they only address the overall topical thrust
>of the item [Medicine -- social aspects, on a book about how medical
>science has treated men and women differently as patients]; overly
>subdivided because LC (so they once claimed) subdivides topics such that
>no one topic gets more than 200 items under it. This means, of course,
>that some topics are much more subdivided than others, cf. anything
>under "united states - history - civil war"
>[United States -- History -- Civil War,
>1861-1865 -- Hospitals -- Periodicals; United States -- History -- Civil
>War, 1861-1865 -- African Americans -- Juvenile fiction.] as opposed to
>[Body piercing -- Pictorial works]. However, it seems that today there
>is a lot of subdividing going on, and in a browse in a large catalog a
>huge number of headings have only 1 or 2 items, including "Punk rock
>music -- California -- Berkeley -- History and criticism" (with one item
>in MELVYL).
>
>I did hear that around the time the MARC
>record became part of automated systems (and no one was trying to
>squeeze subject headings into the little margin at the top of cards),
>the average number of subdivisions per heading AND the average
>number of headings per record went up measurably. So older works would
>have less to work with.
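
The figures mentioned here (items per heading, subdivisions per heading, headings per record) could be measured over any set of records with something like the sketch below. It assumes headings are available as display strings with " -- " between subdivisions, as in the examples quoted above; the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def split_heading(heading: str) -> List[str]:
    """Split 'A -- B -- C' into ['A', 'B', 'C'].

    Only a double hyphen set off by spaces counts as a separator, so a
    date range such as '1861-1865' stays in one piece.
    """
    parts = re.split(r"\s+--\s+", heading.strip())
    return [p.strip() for p in parts if p.strip()]


def heading_stats(records: Iterable[List[str]]) -> Tuple[Counter, float, float]:
    """Return (items per full heading, average subdivisions per heading,
    average headings per record) for an iterable of per-record heading lists."""
    items_per_heading: Counter = Counter()
    n_records = n_headings = n_subdivisions = 0
    for headings in records:
        n_records += 1
        for heading in headings:
            parts = split_heading(heading)
            if not parts:
                continue
            n_headings += 1
            n_subdivisions += len(parts) - 1  # the main heading is not a subdivision
            items_per_heading[" -- ".join(parts)] += 1
    avg_subdivisions = n_subdivisions / n_headings if n_headings else 0.0
    avg_headings = n_headings / n_records if n_records else 0.0
    return items_per_heading, avg_subdivisions, avg_headings
```

Run separately over older and newer records, this would show whether the two averages really did go up.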
>
>I'd like to see some playing around with the LC class numbers. I think
>this would be difficult, but the classification is in machine-readable
>form and there is text associated with the numbering system. The
>advantage is that you have a real hierarchy, or I should say "some real
>hierarchies" because there isn't really an overarching one. You also
>have facets for things like geography and time.

I'd like to see more playing around with LCC, too.  And isn't the LCC
outline something of an overarching hierarchy?
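
As one small example of that kind of playing around, the sketch below treats the class letters as the top of a hierarchy (Q, then QA, then QA plus the class number) and measures how many levels two class numbers share. This is an assumption-laden simplification: the letter prefixes only approximate the outline, the real schedules group ranges of numbers under captions, and the function names are made up for illustration.

```python
import re
from typing import List

# One to three class letters, optionally followed by a class number.
_LCC_RE = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)?")


def lcc_path(call_number: str) -> List[str]:
    """Return broad-to-narrow levels for an LC class number (hypothetical helper).

    The Cutter number and anything after it are ignored.
    """
    m = _LCC_RE.match(call_number.upper())
    if not m:
        return []
    letters, number = m.group(1), m.group(2)
    path = [letters[:i] for i in range(1, len(letters) + 1)]
    if number:
        path.append(f"{letters} {number}")
    return path


def shared_depth(a: str, b: str) -> int:
    """Count how many leading levels two class numbers have in common."""
    depth = 0
    for x, y in zip(lcc_path(a), lcc_path(b)):
        if x != y:
            break
        depth += 1
    return depth
```

Two items in QA would share two levels here, and an item in QA and one in Z would share none, which gives a rough "nearness" measure that LCSH strings alone don't offer.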

Diane

>kc
>
><http://melvyl.cdlib.org:80/F/JRP876T1T711L3726TX2E65JSVJTU8Q3CHVKXE1EVBF8YXYYLM-03150?func=find-acc&acc_sequence=035021245>
>
>p.s. But what really gets me about the LC headings on records is exemplified
>by this: The 1948 edition of Norbert Wiener's book "Cybernetics" has
>only one subject heading: "Mathematical statistics." Since he had just
>invented the term "cybernetics" it didn't exist as a term. By 1965, when
>the book was revised and reprinted, the only subject heading is:
>Cybernetics. Nothing links the two, at least not in the subject heading
>world. In this example, user tagging would probably be more useful.
>(Note, the books do get similar shelf numbers in some of the libraries I
>can see on my screen.)
>
>
>Tim Spalding wrote:
>>I just wrote up a blog post about trying to tease relevancy ranking
>>from LCSHs:
>>
>>http://www.librarything.com/thingology/2007/02/can-subjects-be-relevancy-ranked.php
>>
>>
>>I wonder if anyone has made, seen or can think of any good methods to
>>do it. So far I've only seen non-ranked and popularity-ranked results.
>>In the blog post I talk about playing around with how LCSHs "reinforce"
>>each other statistically, but I couldn't get the algorithm to produce
>>good results more than sporadically.
>>
>>I'm not sure if this is a cataloging question or a coding one. Maybe
>>that's the point.
>>
>>Tim
>>
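
For anyone who wants to experiment with the "reinforcement" idea, one simple reading of it is plain co-occurrence counting: a record ranks higher for a heading when its other headings often appear alongside that heading elsewhere in the collection. The sketch below is not the algorithm from the blog post, which isn't described in this thread; the function names and the scoring rule are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def cooccurrence(records: Dict[str, List[str]]) -> Counter:
    """Count how many records each unordered pair of headings shares."""
    pairs: Counter = Counter()
    for headings in records.values():
        for a, b in combinations(sorted(set(headings)), 2):
            pairs[(a, b)] += 1
    return pairs


def rank_for_heading(records: Dict[str, List[str]],
                     target: str) -> List[Tuple[str, int]]:
    """Rank the records carrying `target` by how strongly their other
    headings co-occur with it across the whole collection."""
    pairs = cooccurrence(records)
    scores: Dict[str, int] = {}
    for rec_id, headings in records.items():
        present = set(headings)
        if target not in present:
            continue
        scores[rec_id] = sum(pairs[tuple(sorted((target, other)))]
                             for other in present - {target})
    # Highest score first; ties are broken by record id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

One obvious weakness: a record whose only heading is the target scores zero, even though it may be the most on-topic item of all, so a raw sum like this would need some normalization before it could be trusted.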
>
>--
>-----------------------------------
>Karen Coyle / Digital Library Consultant
>kcoyle@kcoyle.net http://www.kcoyle.net
>ph.: 510-540-7596
>fx.: 510-848-3913
>mo.: 510-435-8234
>------------------------------------