Re: Creating pdfs from images and their text

From: raffaele messuti <raffaele.messuti_at_nyob>
Date: Fri, 17 Jan 2014 10:24:48 +0100
To: CODE4LIB_at_LISTSERV.ND.EDU

Padraic Stack wrote:
> What is a straightforward way to combine the text with overlaid images
> to create searchable pdfs?

If you have the transcription in hOCR[1] format, the tool you need
should be hocr2pdf[2].
I have never tried it for PDFs, but years ago I made some DjVu files
following this tutorial[3].
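
A rough sketch of how the hocr2pdf step could be scripted, in case it
helps. This is untested against real scans: it assumes hocr2pdf is
installed and on the PATH, that it reads the hOCR from standard input,
and that -i and -o name the input image and output PDF as described in
the man page[2]. The function names and the file names in the usage
note are made up for illustration.

```python
import subprocess


def hocr2pdf_command(image_path, pdf_path):
    """Build the argument list for one hocr2pdf call.

    Assumption: -i is the page image and -o is the PDF to write,
    as described in the hocr2pdf man page.
    """
    return ["hocr2pdf", "-i", str(image_path), "-o", str(pdf_path)]


def make_searchable_pdf(image_path, hocr_path, pdf_path):
    """Overlay the hOCR text on the page image and write a PDF.

    Assumption: hocr2pdf reads the hOCR markup from standard input,
    so the hOCR file is opened here and passed as stdin.
    """
    with open(hocr_path, "rb") as hocr_file:
        subprocess.run(
            hocr2pdf_command(image_path, pdf_path),
            stdin=hocr_file,
            check=True,  # raise CalledProcessError if hocr2pdf fails
        )
```

Calling make_searchable_pdf("page001.tif", "page001.hocr", "page001.pdf")
should then produce one PDF page; a multi-page book would need the
per-page PDFs joined with a separate tool afterwards.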

[1] http://en.wikipedia.org/wiki/HOCR
[2] http://manpages.ubuntu.com/manpages/lucid/man1/hocr2pdf.1.html
[3] https://philikon.wordpress.com/2009/07/23/digitizing-books-to-djvu/

ciao.

--
raffaele
Received on Fri Jan 17 2014 - 04:25:26 EST