LuSql[1] is a high-performance, simple tool for indexing data held in a
DBMS into a Lucene index. It works with any SQL database that has a
JDBC driver.
It includes a tutorial[2] that works through a series of increasingly
complex use cases, showing how to index article metadata held in
several MySQL tables and how to also index the full text stored in
files on the file system.
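
For readers who want a feel for the pattern the tutorial describes
(rows of metadata from SQL tables, plus full text read from files,
combined into indexable documents), here is a minimal sketch in Python.
It is not LuSql and does not use Lucene or JDBC: the standard-library
sqlite3 module stands in for the database, a toy in-memory inverted
index stands in for Lucene, and the table name `article`, the column
`fulltext_path`, and all function names are hypothetical, chosen only
for illustration.

```python
import os
import re
import sqlite3
import tempfile
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rows_to_documents(conn, query):
    """Run a SQL query and yield each row as a dict of column name -> value."""
    cur = conn.execute(query)
    names = [d[0] for d in cur.description]
    for row in cur:
        yield dict(zip(names, row))


def add_fulltext(doc, path_field="fulltext_path"):
    """If the row names an existing file, attach its contents as 'fulltext'."""
    path = doc.get(path_field)
    if path and os.path.isfile(path):
        with open(path, encoding="utf-8") as fh:
            doc["fulltext"] = fh.read()
    return doc


def build_index(docs, key_field, text_fields):
    """Toy inverted index (a stand-in for Lucene): term -> set of document keys."""
    index = defaultdict(set)
    for doc in docs:
        for field in text_fields:
            value = doc.get(field) or ""
            for term in tokenize(str(value)):
                index[term].add(doc[key_field])
    return index


def demo():
    """Index two hypothetical article rows, one of which has a full-text file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "1.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("Lucene stores an inverted index on disk.")

        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE article ("
            "id INTEGER PRIMARY KEY, title TEXT, fulltext_path TEXT)"
        )
        conn.executemany(
            "INSERT INTO article VALUES (?, ?, ?)",
            [(1, "Indexing with Lucene", path),
             (2, "Relational databases", None)],
        )
        query = "SELECT id, title, fulltext_path FROM article"
        docs = (add_fulltext(d) for d in rows_to_documents(conn, query))
        # Consume the generator while the temporary file still exists.
        index = build_index(docs, "id", ["title", "fulltext"])
        conn.close()
    return index
```

Calling `demo()` returns a mapping in which a term that appears only in
the full-text file (such as "inverted") points at article 1, while a
term from the second row's title (such as "databases") points at
article 2. A real Lucene index additionally stores fields, positions,
and scoring data, none of which this sketch models.
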
It has been tested extensively, including using 6.4 million metadata
and full-text records to produce an 86GB index in 13.5 hours.
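
To put those test figures in per-record terms, the arithmetic can be
sketched as below. The function names are hypothetical, and the
calculation assumes 1 GB = 10^9 bytes and a constant indexing rate over
the whole run, neither of which is stated above.

```python
def records_per_second(records, hours):
    """Average indexing rate, assuming the rate is constant over the run."""
    return records / (hours * 3600)


def index_bytes_per_record(index_gb, records):
    """Average index size per record, assuming 1 GB = 10**9 bytes."""
    return index_gb * 10**9 / records


rate = records_per_second(6_400_000, 13.5)
size = index_bytes_per_record(86, 6_400_000)
print(f"{rate:.1f} records/s, {size:.1f} bytes of index per record")
```

Under those assumptions this works out to roughly 132 records per
second and about 13.4 KB of index per record.
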
It is licensed under the Apache 2.0 license.
Glen Newton
[1]http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
[2]http://cuvier.cisti.nrc.ca/~gnewton/lusql/v0.9/lusqlManual.pdf.html
--
Glen Newton | glen.newton_at_nrc-cnrc.gc.ca
Researcher, Information Science, CISTI Research
& NRC W3C Advisory Committee Representative
http://tinyurl.com/yvchmu
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
http://www.nrc-cnrc.gc.ca/
Institut canadien de l'information scientifique et technique (ICIST)
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6
Government of Canada | Gouvernement du Canada
--
Received on Tue Nov 25 2008 - 12:18:33 EST