Re: wikipedia/author disambiguation

From: Ed Summers <ehs_at_nyob>
Date: Tue, 24 May 2011 16:59:08 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU
Big +1 for promoting the use of the Authority Control Wikipedia
template. I know I'm being a bit of a broken record, but you can watch
as people add these by looking at or subscribing to:

    http://linkypedia.inkdroid.org/websites/23/pages/

Also, re: Jonathan's good advice to check out Wikipedia Miner [1], I
just ran across Duke [2] today, which looks like it could help guide
record linking a bit.

"""
Duke is a fast and flexible deduplication (or entity resolution, or
record linkage) engine written in Java on top of Lucene. At the moment
(2011-04-07) it can process 1,000,000 records in 11 minutes on a
standard laptop in a single thread.
"""

Haven't tried it yet, so YMMV, etc.

//Ed

[1] http://wikipedia-miner.sourceforge.net/
[2] http://code.google.com/p/duke/
Received on Tue May 24 2011 - 17:00:22 EDT