On Mon, 2009-05-11 at 11:31 +0100, Jakob Voss wrote:
> A format should be described with a schema (XML Schema, OWL etc.) or at
> least a standard. Mostly this schema already has a namespace or similar
> identifier that can be used for the whole format.
This is unfortunately not the case.
> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML
> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to
> identify MODS.
And this is a perfect example of why this is not the case.
The same MODS schema (let alone namespace) defines TWO formats: mods and
modsCollection.
To quote from the schema:
------------------------------------------------
***** An instance of this schema is
(1) a single MODS record:
-->
<xsd:element name="mods" type="modsType"/>
<!--
or
(2) a collection of MODS records:
-->
<xsd:element name="modsCollection">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="mods" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<!--
***** End of "instance" definition
-------------------------------------------------
So you're using the same identifier to identify two different things at
the same time.
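
To make that concrete, here is a rough Python sketch (not anything from
SRU or the MODS documentation). The two sample records are made up for
illustration and split_tag is just a hypothetical helper; only the
namespace URI and the two root element names come from the schema
quoted above.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the MODS v3 schema discussed above.
MODS_NS = "http://www.loc.gov/mods/v3"

# Two invented sample documents: a single record and a collection.
SINGLE = (
    f'<mods xmlns="{MODS_NS}">'
    "<titleInfo><title>A</title></titleInfo>"
    "</mods>"
)
COLLECTION = (
    f'<modsCollection xmlns="{MODS_NS}">'
    "<mods><titleInfo><title>A</title></titleInfo></mods>"
    "<mods><titleInfo><title>B</title></titleInfo></mods>"
    "</modsCollection>"
)


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


single_ns, single_root = split_tag(ET.fromstring(SINGLE).tag)
coll_ns, coll_root = split_tag(ET.fromstring(COLLECTION).tag)

# Same namespace, different root elements.
print(single_ns == coll_ns)
print(single_root, coll_root)
```

Both documents report the same namespace, so a client that is told only
"http://www.loc.gov/mods/v3" cannot know whether to expect one record or
a collection of them.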
We discussed this a lot during the development of SRU and there simply
isn't an existing identifier for an XML 'format'.
Also consider the following more hypothetical, but perfectly feasible
situations:
* One namespace is used to define two _totally_ separate sets of
elements. There's no reason why this can't be done.
* One namespace defines so many elements that it's meaningless to call
it a format at all. Even though the top level tag might be the same,
the contents are so varied that you're unable to realistically process
it.
Rob
Received on Mon May 11 2009 - 06:43:13 EDT