I'm having trouble getting clear enough in my head what it is I'm
looking for to explain it. Maybe I can provide enough ideas that
someone else can help.
I'm thinking of support for relationship-based browsing and profiling
of a collection, or a result set (let's call either one a 'corpus').
I mean using the subject vocabulary to categorize items in a corpus,
or to allow a user to explore a corpus to see what's there, OR to allow
a user to easily see related subjects/topics/classes and navigate
between them to find the one that best meets the user's need.
These may not all be the same goals, but I think they're related, and
I think it's difficult to use LCSH as a basis for such goals. I
think relationships are at the heart of satisfying these goals, and
LCSH is problematic mainly because the nature of relationships in
LCSH is unsystematic, unpredictable, and duplicative (BT/NT
relationships exist, but inconsistently; then there is the
subdivision hierarchy, which is an entirely parallel set of
relationships; then there are relationships that are only implicitly
suggested by alphabetic proximity, since in a card catalog
environment that was sufficient; on top of all that we have
inconsistent use of inverted vs. direct order headings, and
inconsistent levels of pre-coordination).
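To make the navigation idea a bit more concrete, here is a minimal sketch in Python. It assumes a vocabulary whose broader/narrower links are stored explicitly and consistently, which is exactly what LCSH doesn't reliably give us. The headings and the NARROWER table are made up for illustration; they are not real LCSH data, and the function names are hypothetical.

```python
from collections import deque

# Hypothetical, hand-made narrower-term (NT) links; not real LCSH data.
NARROWER = {
    "Music": ["Jazz", "Folk music"],
    "Jazz": ["Bebop", "Swing"],
    "Folk music": [],
    "Bebop": [],
    "Swing": [],
}


def broader_terms(term):
    """Derive broader terms (BT) by inverting the narrower-term table."""
    return sorted(bt for bt, nts in NARROWER.items() if term in nts)


def explode(term):
    """Return the term plus all of its narrower terms, breadth-first."""
    seen = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for nt in NARROWER.get(current, []):
            if nt not in seen:  # guard against cycles and duplicates
                seen.append(nt)
                queue.append(nt)
    return seen
```

With links like these, moving "up" from Bebop to Jazz or "down" from Music to everything beneath it is a simple lookup. The sketch only works because every relationship lives in one consistent structure.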
Some of the features Laura mentions are helpful toward the goals I'm
talking about, but there are other things I'm having trouble putting
into words.
For an example of the problem, check out the famous AquaBrowser, for
instance at KCLS: http://www.kcls.org/
Check out the 'word cloud'. Time will tell if it's actually useful
to users. To me, it seems like a mess. But the basic idea is a good
one---the problem is that LCSH (and our other vocabularies) aren't
suitable for this kind of exploration. In the
case of AquaBrowser, it's not just based on LCSH, it's based on a
sort of random harvesting of any available words and an attempt at
machine-processing into a useful relationship graph, which is what
leads to a mess. But part of the reason it's not just based on LCSH
is that LCSH alone wouldn't actually work for this either. What
would it take to have a subject vocabulary which was amenable to this
kind of 'word cloud' for allowing users to explore, navigate, and
profile?
Alternately, look at the Flamenco demos:
http://flamenco.berkeley.edu/demos.html
That kind of interface for exploring is very powerful. Could you
provide it through LCSH? [In this case, the issue is 'facets', but
I'm not just talking about facets, or about pre-coordination vs.
post-coordination].
Finally, a somewhat unrelated note: Form/genre. Users are very very
interested in searching and browsing based on form/genre. Users do
not necessarily distinguish between 'subject' and form/genre. LCSH
does include some form/genre information, but in a very un-systematic
way. A typical MARC record includes form/genre information in dozens
of different unsystematic and mutually incompatible ways. The user
should be able to do the same kind of
browsing/navigating/profiling/exploring I'm trying to describe above
in terms of form/genre as well as 'aboutness', indeed both at once.
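As a rough illustration of profiling on both at once, here is a small Python sketch. It assumes each record carries exactly one clean subject value and one clean form/genre value, which real MARC records do not. The CORPUS records, field names, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical records; real MARC data is nowhere near this tidy.
CORPUS = [
    {"title": "A", "subject": "Jazz", "genre": "Biography"},
    {"title": "B", "subject": "Jazz", "genre": "Sound recording"},
    {"title": "C", "subject": "Gardening", "genre": "Handbook"},
    {"title": "D", "subject": "Jazz", "genre": "Biography"},
]


def profile(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records)


def narrow(records, **chosen):
    """Keep only the records matching every chosen facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in chosen.items())]
```

For example, profile(narrow(CORPUS, subject="Jazz"), "genre") shows which forms/genres exist within the jazz items, and narrow(CORPUS, subject="Jazz", genre="Biography") then restricts on 'aboutness' and form/genre together.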
Hopefully this ramble is food for thought for someone.
--Jonathan
At 4:58 PM -0400 6/20/06, Laura Akerman wrote:
>I'm so glad this subject has come up!
>
>I was thinking of questions to pose to this group, but maybe it
>would work better to make some debatable statements (y'all can agree
>or disagree and expatiate)
>
>1. For subject access, keyword indexing of full text is not good enough.
>2. Subject keywords in metadata are better than nothing, but
>controlled vocabulary is needed.
>3. Library of Congress subject headings are the only truly
>comprehensive English language subject controlled vocabulary, but
>they don't work well enough.
>4. A truly useful subject controlled vocabulary would support:
> a -- natural language searching
> b -- "exploding" hierarchical search (allow searching of all
>"narrower terms" and their "narrower terms" under a topic)
> c -- expression of more complex relationships between topics
>(relationship expressed by more than position in a string and "dash
>dash").
> d -- much more extensive references - perhaps using "term
>clusters" or some other means, to support easy links between "user
>vocabulary" and an identified concept, so that a smart catalog could
>lead the user to choose the concept they want for ambiguous terms
>(such as "records" with the sound recordings meaning versus business
>records versus peak sports performance) and bring them records on
>the subject they want, without their having to "learn" the
>controlled vocabulary.
> e -- choice of preferred term for a particular catalog, (or
>perhaps for a particular record?) without losing collocation and
>references.
> f -- automated or semi-automated subject assignment
> g -- ? (what else?)
>
>What do we need? (leave how to get it for later) and what's out
>there that could be a model? How could the ideal subject vocabulary
>work?
>
>There!
>
>Laura
>
>--
>Laura Akerman
>Technology and Metadata Librarian
>Robert W. Woodruff Library, Room 128
>Emory University
>Atlanta, Ga. 30322
>phone (404) 727-6888
>fax 404-727-0053
>
>K.G. Schneider wrote:
>
>>>And we also need subject headings. Try a nice broad search in Google
>>>Book Search (tm) to see what retrieval looks like without them.
>>>
>>>
>>
>>"We need some kind of subject headings" and "we need LCSH" are not one and
>>the same. That's what I'm trying to get at. Going from LCSH to Google Book
>>Search is a false dichotomy.
>>
>>Karen G. Schneider
>>kgs_at_bluehighways.com
>>
Received on Tue Jun 20 2006 - 17:53:42 EDT