Indexing Word documents using Solr

From: Eric Lease Morgan <emorgan_at_nyob>
Date: Tue, 10 Feb 2015 11:12:09 -0500
To: CODE4LIB_at_LISTSERV.ND.EDU
Can somebody point me to a good tutorial on how to index Word documents using Solr?

I have a few hundred Microsoft Word documents I want to search. With the Tika library, it seems I ought to be able to index my Word documents directly into Solr, but none of the tutorials I have found on the Web are complete: they point to missing directories, missing files, documentation for unreleased versions, etc.

Put another way, Tika can create a (nice) XHTML file, complete with some useful metadata, all of which can be fed to Solr for indexing, but I can barely get out of the starting gate. Have you indexed Word documents using Solr, and if so, how?
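
To make the question concrete, here is a rough sketch of the kind of thing I am after. It relies on Solr's extracting request handler (Solr Cell), which runs Tika on the server side, so the Word file is posted as-is. Everything specific here is an assumption rather than a tested recipe: a Solr instance at localhost:8983, a core hypothetically named "worddocs", the /update/extract handler enabled in solrconfig.xml, and the function names, which are purely illustrative.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Assumption: local Solr, a core named "worddocs" (hypothetical), and the
# extraction handler (Solr Cell) enabled at /update/extract.
SOLR_BASE = "http://localhost:8983/solr/worddocs"


def extract_url(base, doc_id, commit=False):
    """Build the /update/extract URL for one document.

    literal.id sets the document's unique key; commit=true asks Solr to
    make the change searchable right away.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base.rstrip("/") + "/update/extract?" + urllib.parse.urlencode(params)


def index_file(path, base=SOLR_BASE, commit=False):
    """POST the raw bytes of one Word file; Solr passes them to Tika."""
    path = Path(path)
    request = urllib.request.Request(
        extract_url(base, path.name, commit),
        data=path.read_bytes(),
        # Generic content type; Tika is left to detect the real format.
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_directory(directory, base=SOLR_BASE):
    """Index every .doc and .docx file, committing only on the last one."""
    files = sorted(
        p for p in Path(directory).iterdir()
        if p.suffix.lower() in {".doc", ".docx"}
    )
    for i, p in enumerate(files):
        index_file(p, base, commit=(i == len(files) - 1))
    return len(files)
```

Calling index_directory("/path/to/word/files") would then send each file in turn, using the file name as the document id. Which fields the extracted text and metadata land in depends on the handler's configuration and the schema, which is exactly the part the tutorials leave out.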

—
Eric Morgan
Received on Tue Feb 10 2015 - 11:14:59 EST