We also use pdftotext and have been happy with it.
--
Chad Mills
Programming Coordinator
Ph: 732.932.8573 x123
Fax: 732.932.1386
Cell: 732.309.8538
Rutgers University Libraries
Scholarly Communication Center
Room 409D, Alexander Library
169 College Avenue, New Brunswick, NJ 08901
http://rucore.libraries.rutgers.edu/
----- Original Message -----
From: "Eric Lease Morgan" <emorgan_at_ND.EDU>
To: CODE4LIB_at_LISTSERV.ND.EDU
Sent: Tuesday, June 21, 2011 10:28:39 AM
Subject: Re: [CODE4LIB] PDF->text extraction
On Jun 21, 2011, at 10:23 AM, Owen Stephens wrote:
> We've tried iText but had issues with quality
> We moved to PDFBox but are having performance issues
I have been satisfied with pdftotext, which is part of the Xpdf suite of tools -- http://bit.ly/kIHD1x
--
Eric Lease Morgan
University of Notre Dame
(574) 631-8604
Received on Tue Jun 21 2011 - 10:51:14 EDT