OCR PDFs

From: James Tuttle <james_tuttle_at_nyob>
Date: Fri, 17 Oct 2008 07:56:37 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU

I wonder if any of you have experience creating text-based PDFs from
TIFFs. I've been using tiffcp to stitch the TIFFs together into a
single TIFF file and then using tiff2pdf to generate a PDF from that
file. So far I've had to pass this image-based PDF to someone with
Acrobat, who uses its batch processing facility to OCR the text and
save a text-based PDF. I wonder if anyone has suggestions for OCR
software I can integrate into the script (Python on Linux) I'm using.
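
For context, the stitching and conversion steps described above might
look roughly like the following in Python. This is only a sketch: the
function names and file names are hypothetical, it assumes the libtiff
tools tiffcp and tiff2pdf are on the PATH, and it stops at the
image-based PDF because the OCR step is the open question.

```python
import subprocess


def build_commands(tiff_pages, combined_tiff, output_pdf):
    """Return the tiffcp and tiff2pdf command lines for one document.

    tiff_pages, combined_tiff and output_pdf are hypothetical names
    chosen for this sketch.
    """
    pages = [str(p) for p in tiff_pages]
    if not pages:
        raise ValueError("need at least one TIFF page")
    # tiffcp takes the source files first and the destination last.
    stitch = ["tiffcp", *pages, str(combined_tiff)]
    # tiff2pdf writes to the file named by -o instead of stdout.
    convert = ["tiff2pdf", "-o", str(output_pdf), str(combined_tiff)]
    return stitch, convert


def tiffs_to_image_pdf(tiff_pages, combined_tiff, output_pdf):
    """Run both steps; the result is an image-only PDF with no text layer."""
    for cmd in build_commands(tiff_pages, combined_tiff, output_pdf):
        # check=True raises CalledProcessError if a tool exits non-zero.
        subprocess.run(cmd, check=True)


# Example call (file names are placeholders):
# tiffs_to_image_pdf(["page1.tif", "page2.tif"], "all.tif", "out.pdf")
```

Whatever OCR tool ends up being used would slot in after the second
command, in place of the manual Acrobat step.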

Thanks,
James

--
-------------------------------
James Tuttle
Digital Repository Librarian

NCSU Libraries, Box 7111
North Carolina State University
Raleigh, NC 27695-7111
james_tuttle_at_ncsu.edu

(919)513-0651 Phone
(919)515-3031  Fax

Received on Fri Oct 17 2008 - 06:18:34 EDT