Re: MarcXML and char encodings

From: Sheila M. Morrissey <Sheila.Morrissey_at_nyob>
Date: Tue, 17 Apr 2012 16:22:53 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU
From the XML standard:

	It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix. XML processors SHOULD match character encoding names in a case-insensitive way and SHOULD either interpret an IANA-registered name as the encoding registered at IANA for that name or treat it as unknown (processors are, of course, not required to support all IANA-registered encodings).


As I suggested: since MARC8 isn't (so far as I know) registered, you won't get far with most standard tools, in whatever language. You'll have to extend them, first to recognize the encoding name, and second to decode the content.
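
To make those two steps concrete, here is a rough Python sketch (not tested against real MARC data). It first checks whether the codec registry knows an encoding name, then registers a placeholder codec under the unregistered-style name "x-marc8". The names "x-marc8", _search, _decode_sketch, _encode_sketch and register_marc8_sketch are all made up for illustration. The decoder is NOT a real MARC-8 decoder: it assumes only the default ASCII graphic set is in use and rejects anything containing an escape sequence.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python's codec registry recognizes this encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def _decode_sketch(data, errors="strict"):
    # Placeholder only (assumption): handles the plain ASCII subset and
    # refuses MARC-8 escape sequences instead of interpreting them.
    raw = bytes(data)
    esc = raw.find(b"\x1b")
    if esc != -1:
        raise UnicodeDecodeError(
            "x-marc8", raw, esc, esc + 1,
            "escape sequences are not handled in this sketch")
    return raw.decode("ascii", errors), len(raw)


def _encode_sketch(text, errors="strict"):
    # Placeholder only (assumption): ASCII subset, no character set switching.
    return text.encode("ascii", errors), len(text)


def _search(name):
    # The registry passes a lower-cased name; newer Python versions also
    # turn hyphens into underscores, so normalize before comparing.
    if name.replace("-", "_") == "x_marc8":
        return codecs.CodecInfo(
            encode=_encode_sketch, decode=_decode_sketch, name="x-marc8")
    return None


def register_marc8_sketch():
    """Step one: teach the codec registry the (hypothetical) name."""
    codecs.register(_search)
```

Before register_marc8_sketch() is called, is_known_encoding("MARC-8") and is_known_encoding("x-marc8") are both False; afterwards b"Moby Dick".decode("x-marc8") goes through the placeholder. The real work is the second step, writing a decoder that actually understands the MARC-8 escape sequences and combining characters, which this sketch does not attempt.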

smm

-----Original Message-----
From: Jonathan Rochkind [mailto:rochkind_at_jhu.edu] 
Sent: Tuesday, April 17, 2012 4:19 PM
To: Code for Libraries
Cc: Sheila M. Morrissey
Subject: Re: [CODE4LIB] MarcXML and char encodings



On 4/17/2012 3:01 PM, Sheila M. Morrissey wrote:
> No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 in the XML prolog,

Wait, how can you declare a Marc8 encoding in an XML 
declaration/prolog/whatever it's called?

The things that appear there need to be from a specific list, and I 
didn't think Marc8 was on that list?

Can you give me an example?  And, if you happen to have it, a link to the 
part of the XML standard that says this is legal?
Received on Tue Apr 17 2012 - 16:23:59 EDT