Re: unwanted (bogus) characters in marc

From: Thomas Krichel <krichel_at_nyob>
Date: Thu, 7 Oct 2010 14:17:14 +0200
To: CODE4LIB_at_LISTSERV.ND.EDU
  Ere Maijala writes

> # Fix non-UTF-8 characters with two highest bits set (we assume they
> are actually ISO-8859-1)

  What about

use Encode;                      # provides decode()
use Encode::Guess qw/latin-1/;   # add Latin-1 to the default suspects
my $decoded = decode("Guess", $dodgy_input);

  $decoded should then be a UTF-8 string with the utf8 flag on.
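
  One caveat, as far as I know: any valid UTF-8 byte sequence is also
  valid ISO-8859-1, so Guess may report the input as ambiguous instead of
  picking one, and decode("Guess", ...) dies in that case. Below is a
  minimal sketch of a fallback. decode_dodgy is a hypothetical helper
  name, and the policy (strict UTF-8 first, ISO-8859-1 otherwise) is my
  assumption based on the quoted comment, not something Encode::Guess
  provides.

```perl
use strict;
use warnings;
use Encode;
use Encode::Guess;    # exports guess_encoding()

# decode_dodgy() is a hypothetical helper, not part of Encode.
sub decode_dodgy {
    my ($octets) = @_;

    # guess_encoding() returns an encoding object on success and a plain
    # error string when it finds no match or more than one candidate.
    my $enc = guess_encoding($octets, 'iso-8859-1');
    return $enc->decode($octets) if ref $enc;

    # Assumed policy: prefer strict UTF-8, otherwise treat as ISO-8859-1.
    # decode() with a CHECK value may modify its input, so work on a copy.
    my $copy = $octets;
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return defined $text ? $text : decode('iso-8859-1', $octets);
}

binmode STDOUT, ':encoding(UTF-8)';
print decode_dodgy("caf\xE9"), "\n";        # ISO-8859-1 bytes
print decode_dodgy("caf\xC3\xA9"), "\n";    # UTF-8 bytes
```

  Both calls should give the same decoded text, whichever path the guess
  takes. If your data can contain other 8-bit encodings, this policy will
  quietly mislabel them as ISO-8859-1.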


  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
                                http://authorclaim.org/profile/pkr1
                                               skype: thomaskrichel
Received on Thu Oct 07 2010 - 08:18:37 EDT