Re: pdf2txt

From: Eric Lease Morgan <emorgan_at_nyob>
Date: Fri, 11 Oct 2013 13:45:37 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU
On Oct 11, 2013, at 11:57 AM, Peter Murray <peter.murray_at_lyrasis.org> wrote:

>> For a limited period of time I am making publicly available a Web-based program called PDF2TXT --http://bit.ly/1bJRyh8
> 
> Very neat.  I couldn't get the 'network diagram' link to work (from http://dh.crc.nd.edu/sandbox/pdf2txt/pdf2txt.cgi?cmd=search&id=1381506693&query=public%20library).  How hard do you think it would be to do stemming before some of the subsequent processing?  The bi-grams "public libraries" and "public library" are usually the same thing.

Peter, alas, the network diagram is not functional yet. Stemming beforehand, hmmm… I suppose that is possible. --Eric Morgan
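
For illustration only, here is a rough sketch of what stemming before bi-gram counting might look like. It is not PDF2TXT's code; the function names naive_stem and stemmed_bigrams are hypothetical, and the crude plural-stripping rule is a stand-in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

def naive_stem(word):
    """Crude plural stripping; an assumed stand-in for a real stemmer."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # libraries -> library
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]                # books -> book, but class stays class
    return word

def stemmed_bigrams(text):
    """Lowercase and tokenize the text, stem each token, then count adjacent pairs."""
    tokens = [naive_stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    return Counter(zip(tokens, tokens[1:]))

counts = stemmed_bigrams("The public library and other public libraries")
print(counts[("public", "library")])
```

Because stemming happens before the pairs are built, "public libraries" and "public library" fall under the same bi-gram key instead of being counted separately.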
Received on Fri Oct 11 2013 - 13:46:05 EDT