Re: web-based ocr

From: Eric Lease Morgan <emorgan_at_nyob>
Date: Fri, 15 Mar 2013 11:09:26 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU

Here's an idea for web-based OCR:

  1. Have Web-based OCR available

  2. Make it easy for people to save
     content in a Web-accessible
     location, something like Box.net

  3. Allow readers (I don't use the
     word "users" anymore) to select
     items from their Web-accessible
     location and have them returned
     as OCR'ed texts

  4. Go a bit further and allow
     readers to do basic text mining
     on their corpus: word and phrase
     tabulations, word clouds,
     concordances, parts-of-speech
     analysis, named-entity
     extraction, etc.
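
As a rough sketch of the tabulation and concordance ideas in item 4, assuming the OCR'ed text is already in hand as a plain string, something like the following Python might do. It uses only the standard library. The function names (tokenize, word_counts, phrase_counts, concordance) are made up for illustration and do not come from any existing tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_counts(tokens):
    """Tabulate how often each word occurs."""
    return Counter(tokens)


def phrase_counts(tokens, n=2):
    """Tabulate how often each run of n adjacent words occurs."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def concordance(tokens, keyword, width=3):
    """List each occurrence of keyword with `width` words of context per side."""
    keyword = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + ["[" + token + "]"] + right))
    return lines


if __name__ == "__main__":
    # Hypothetical sample standing in for a reader's OCR'ed corpus.
    sample = "The cat sat on the mat. The cat ran."
    tokens = tokenize(sample)
    print(word_counts(tokens).most_common(3))
    print(phrase_counts(tokens, n=2).most_common(3))
    for line in concordance(tokens, "cat", width=2):
        print(line)
```

Word clouds, parts-of-speech analysis, and named-entity extraction would need more than this (a drawing library and a trained language model, respectively), so they are left out of the sketch.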

Yea, I know. Much of this has been done, but it has not been glued together.

--
ELM
Received on Fri Mar 15 2013 - 11:10:15 EDT