Re: Are MARC subfields really useful ?

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Fri, 4 Jun 2010 19:01:24 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU
"My professional bias is toward fine granularity... But looking at how much time my (rare books) catalogers are spending on marking some details only for presentation, with detriment to the subject headings, for instance, I doubt the cost effectiveness."

Part of our general issue is our broken cooperative cataloging infrastructure.

To me, a little bit of human-created metadata, with the appropriate structure and granularity for machine use, is better than a LOT of metadata without that appropriate structure and granularity.

In my utopian cataloging/metadata universe, a record would possibly start out with just a little bit of "description" -- at whatever level the body doing the cataloging thought was appropriate cost/benefit for _their_ needs. But the metadata would be recorded _well_, with the proper structure and granularity (unlike what we have now). Then someone else might come along and add another data element or two or a dozen -- with the proper structure and granularity -- and when they did, everyone else sharing that record would automatically get their additions with little human intervention (because human intervention is a cost). And that "someone" could maybe even be a "patron" or "general public" -- you can get "good enough" data out of the general public if you have the right data model and the right software. Neither of which we have now. And without that, you don't get good enough data even out of trained professionals. Even with that, trained professionals might give you _better_ data -- and in my ideal world, if a trained professional improves data (because their employing organization thought the improvement was justified by the cost-benefit to their organization), those improvements would also immediately be automatically shared with everyone else with little or no human intervention.

This sort of universe is totally feasible from a technical standpoint -- it just takes the will to create it, resources put into its creation (including skilled people to invent, design, and create it), and coordinated action by the library community toward the goal. None of which we seem to have, and I'm not sure how optimistic I can be about us having them before it's "too late" and the library cataloging/metadata tradition essentially dies through neglect and the unsustainability of a current system whose cost-benefit is, as Kyle suggests, NOT effective or sustainable.

Jonathan
________________________________________
From: Next generation catalogs for libraries [NGC4LIB_at_LISTSERV.ND.EDU] On Behalf Of Dan Matei [Dan_at_CIMEC.RO]
Sent: Friday, June 04, 2010 5:35 PM
To: NGC4LIB_at_LISTSERV.ND.EDU
Subject: Re: [NGC4LIB] Are MARC subfields really useful ?

-----Original Message-----
From: Kyle Banerjee <kyle.banerjee_at_GMAIL.COM>
Date: Fri, 4 Jun 2010 09:00:47 -0700

>
> MARC is just a container. As such, you can normalize the data and structure
> to make the data useful for an application. But there is no technical fix
> for conceptual problems.





My professional bias is toward fine granularity and I'm not really close to apostasy :-)

But looking at how much time my (rare books) catalogers are spending on marking some details only for presentation, with detriment to the subject headings, for instance, I doubt the cost effectiveness.

I acknowledge that if a record is to be exposed as MLA or APA or Chicago, subfield codes are much better than ISBD punctuation :-) However, I still wonder if the price is right.

Let's look at the granularity from the "linked data" perspective.


Last year I spent some time devising an XML schema more ambitious than MODS, able - for instance - to detect whether the parallel sequences are "balanced", i.e. whether the sequence to the right of = is isomorphic to the sequence to the left of =. (For the curious, I have added the ridiculously long XML Schema fragment for this at the end of the post.)

See two simplified examples (reduced to the title area):

<manifestation pml:guid="8f8d4b0f-161b-4204-b1cd-3571c1529c8b" pml:compilationTime="2010-06-04T22:21:13">
        <titleAndResposibilityArea>
                <titleProper xml:lang="en">Applications of ecological (biophysical) land classification in Canada</titleProper>
                <titleInformation>proceedings of second meeting, 4-7 April 1978, Victoria, British Columbia</titleInformation>
                <parallelTitleProper xml:lang="fr">Applications de la classification ecologique (biophysique) du territoire au Canada</parallelTitleProper>
                <parallelTitleInformation>compte rendu de la deuxième réunion, 4-7 avril 1978, Victoria, British Columbia</parallelTitleInformation>
                <responsibility>Canada Committee on Ecological (Biophysical) Land Classification</responsibility>
                <responsibility>compiled and edited by C.D.A. Rubec</responsibility>
        </titleAndResposibilityArea>
</manifestation>

<manifestation pml:guid="a657df1b-565f-4150-99e1-438d3acc22b5" pml:compilationTime="2010-06-04T22:26:24">
        <titleAndResposibilityArea>
                <titleProper>Contre les valeurs bourgeoises</titleProper>
                <responsibility>par Gilbert Ganne</responsibility>
                <titleProper>Pour les valeurs bourgeoises</titleProper>
                <responsibility>par Georges Hourdin</responsibility>
        </titleAndResposibilityArea>
</manifestation>
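
The "balanced" test mentioned above can also be approximated outside the schema. What follows is only a minimal Python sketch under a simplified reading of "balanced": the first run of parallel* elements must mirror, name for name, the elements that precede it. The function names and the shortened sample strings are illustrative assumptions, not part of the schema in the appendix.

```python
import xml.etree.ElementTree as ET

PREFIX = "parallel"

# Title area of the first example, shortened for the sketch.
BALANCED = """
<titleAndResposibilityArea>
  <titleProper xml:lang="en">Applications of ecological (biophysical) land classification in Canada</titleProper>
  <titleInformation>proceedings of second meeting</titleInformation>
  <parallelTitleProper xml:lang="fr">Applications de la classification ecologique (biophysique) du territoire au Canada</parallelTitleProper>
  <parallelTitleInformation>compte rendu de la deuxième réunion</parallelTitleInformation>
  <responsibility>compiled and edited by C.D.A. Rubec</responsibility>
</titleAndResposibilityArea>
"""

# Same area with the parallel title information left out (made up for the sketch).
UNBALANCED = """
<titleAndResposibilityArea>
  <titleProper>Applications of ecological (biophysical) land classification in Canada</titleProper>
  <titleInformation>proceedings of second meeting</titleInformation>
  <parallelTitleProper>Applications de la classification ecologique (biophysique) du territoire au Canada</parallelTitleProper>
</titleAndResposibilityArea>
"""


def base_name(tag):
    """Map a parallel element name to its counterpart: parallelTitleProper -> titleProper."""
    rest = tag[len(PREFIX):]
    return rest[:1].lower() + rest[1:]


def is_balanced(area):
    """Return True if the first run of parallel* children mirrors the children before it."""
    originals, parallels = [], []
    for child in area:
        if child.tag.startswith(PREFIX):
            parallels.append(base_name(child.tag))
        elif parallels:
            break  # the parallel run has ended; later elements are ignored
        else:
            originals.append(child.tag)
    # An area without any parallel elements has nothing to balance.
    return not parallels or parallels == originals


print(is_balanced(ET.fromstring(BALANCED.strip())))
print(is_balanced(ET.fromstring(UNBALANCED.strip())))
```

A check like this only compares element names; it says nothing about whether the parallel text is a faithful translation, and it does not handle several parallel languages in a row.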

Now, imagine that I want to expose the last manifestation as RDF triples (I'm not a fan of RDF, but it's so fashionable these days...:-). How should I do that ?

As one triple ?

subject:  a657df1b-565f-4150-99e1-438d3acc22b5
predicate: title & responsibility area
object: Contre les valeurs bourgeoises / par Gilbert Ganne. Pour les valeurs bourgeoises / par Georges Hourdin

Or each element as a distinct triple ?

subject:  a657df1b-565f-4150-99e1-438d3acc22b5
predicate: title & responsibility area
object: a

subject: a
predicate: title group
object: a1

subject: a
predicate: title group
object: a2

subject: a1
predicate: title proper
object: Contre les valeurs bourgeoises

subject: a1
predicate: responsibility
object:  par Gilbert Ganne

subject: a2
predicate: title proper
object:  Pour les valeurs bourgeoises

subject: a2
predicate: responsibility
object:  par Georges Hourdin

The second solution seems overkill to me. Besides, I lose the order of the two titles within the area !
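
One possible way to keep the order in the fine-grained variant is to record it explicitly on each title group. The Python sketch below is only an illustration under several assumptions: the "position" predicate, the blank-node labels (_:area, _:group1, ...) and the namespace URI bound to the pml prefix are all made up here, and the triples are plain tuples rather than real RDF. In actual RDF, rdf:Seq or rdf:List would be the usual candidates for the same job.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the post never declares the pml prefix.
PML_NS = "urn:example:pml"

RECORD = f"""
<manifestation xmlns:pml="{PML_NS}" pml:guid="a657df1b-565f-4150-99e1-438d3acc22b5">
  <titleAndResposibilityArea>
    <titleProper>Contre les valeurs bourgeoises</titleProper>
    <responsibility>par Gilbert Ganne</responsibility>
    <titleProper>Pour les valeurs bourgeoises</titleProper>
    <responsibility>par Georges Hourdin</responsibility>
  </titleAndResposibilityArea>
</manifestation>
"""


def to_triples(manifestation):
    """Flatten the title area into (subject, predicate, object) tuples, one group per title."""
    guid = manifestation.get(f"{{{PML_NS}}}guid")
    area = manifestation.find("titleAndResposibilityArea")
    area_node = "_:area"
    triples = [(guid, "titleAndResponsibilityArea", area_node)]
    group_node, position = None, 0
    for child in area:
        # Every titleProper opens a new title group; "position" remembers its rank.
        if child.tag == "titleProper" or group_node is None:
            position += 1
            group_node = f"_:group{position}"
            triples.append((area_node, "titleGroup", group_node))
            triples.append((group_node, "position", position))
        triples.append((group_node, child.tag, child.text))
    return triples


def to_isbd(triples):
    """Rebuild the one-string form from the triples, whatever order they arrive in."""
    positions = {s: o for s, p, o in triples if p == "position"}
    parts = []
    for group in sorted(positions, key=positions.get):
        title = next(o for s, p, o in triples if s == group and p == "titleProper")
        resp = [o for s, p, o in triples if s == group and p == "responsibility"]
        parts.append(title + (" / " + " ; ".join(resp) if resp else ""))
    return ". ".join(parts)


triples = to_triples(ET.fromstring(RECORD.strip()))
for triple in triples:
    print(triple)
print(to_isbd(list(reversed(triples))))
```

The price of keeping the order is one extra triple per title group, which makes the fine-grained variant even longer; whether that is worth it is exactly the granularity question.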


So, which is the reasonable granularity ?

Dan



====================================================================================
Appendix
<xs:group name="group.title">
        <xs:sequence>
                <xs:element name="avantTitre" type="pml:type.statement" minOccurs="0"/>
                <xs:element name="titleProper" type="pml:type.statement"/>
                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0" maxOccurs="unbounded"/>
                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                        <xs:element name="or" type="pml:type.statement"/>
                        <xs:element name="alternativeTitle" type="pml:type.statement"/>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                        </xs:sequence>
                </xs:sequence>
                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                        <xs:element name="subtitle" type="pml:type.statement"/>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelSubtitle" type="pml:type.statement"/>
                        </xs:sequence>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="or" type="pml:type.statement"/>
                                <xs:element name="alternativeTitle" type="pml:type.statement"/>
                                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                        <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                        <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                        <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                                        <xs:element name="parallelSubtitle" type="pml:type.statement" minOccurs="0"/>
                                </xs:sequence>
                        </xs:sequence>
                </xs:sequence>
                <xs:sequence minOccurs="0">
                        <xs:element name="genre" type="pml:type.statement"/>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelSubtitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelGenre" type="pml:type.statement" minOccurs="0"/>
                        </xs:sequence>
                </xs:sequence>
                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                        <xs:element name="titleInformation" type="pml:type.statement"/>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelSubtitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelGenre" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelTitleInformation" type="pml:type.statement" minOccurs="0"/>
                        </xs:sequence>
                </xs:sequence>
                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                        <xs:element name="part">
                                <xs:complexType mixed="true">
                                        <xs:sequence>
                                                <xs:element name="number" type="pml:type.statement" minOccurs="0"/>
                                                <xs:group ref="pml:group.title" minOccurs="0"/>
                                        </xs:sequence>
                                        <xs:attributeGroup ref="pml:att.elementMetadata"/>
                                        <xs:attributeGroup ref="pml:att.sortMetadata"/>
                                </xs:complexType>
                        </xs:element>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelSubtitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelGenre" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelTitleInformation" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelPart" minOccurs="0">
                                        <xs:complexType mixed="true">
                                                <xs:sequence>
                                                        <xs:element name="number" type="pml:type.statement" minOccurs="0"/>
                                                        <xs:element name="title" type="pml:type.statement" minOccurs="0"/>
                                                        <xs:element name="subtitle" type="pml:type.statement" minOccurs="0"/>
                                                        <xs:element name="genre" type="pml:type.statement" minOccurs="0"/>
                                                        <xs:element name="titleInformation" type="pml:type.statement" minOccurs="0" maxOccurs="unbounded"/>
                                                        <xs:element name="responsibility" type="pml:type.statement" minOccurs="0" maxOccurs="unbounded"/>
                                                </xs:sequence>
                                                <xs:attributeGroup ref="pml:att.elementMetadata"/>
                                                <xs:attributeGroup ref="pml:att.sortMetadata"/>
                                        </xs:complexType>
                                </xs:element>
                        </xs:sequence>
                </xs:sequence>
                <xs:sequence minOccurs="0" maxOccurs="unbounded">
                        <xs:element name="responsibility" type="pml:type.statement"/>
                        <xs:sequence minOccurs="0" maxOccurs="unbounded">
                                <xs:element name="parallelTitleProper" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelOr" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelAlternativeTitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelSubtitle" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelGenre" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelTitleInformation" type="pml:type.statement" minOccurs="0"/>
                                <xs:element name="parallelResponsibility" type="pml:type.statement" minOccurs="0" maxOccurs="unbounded"/>
                        </xs:sequence>
                </xs:sequence>
        </xs:sequence>
</xs:group>
Received on Fri Jun 04 2010 - 19:02:54 EDT