On 3/30/2011 11:33 AM, Till Kinstler wrote:
> 1/0) is dating back to the 1970s. And some conclusions from that made it
> even into libraryland as early as the 1980s (s. for example writings by
> Charles Hildreth, one article from 1987 even being titled "Beyond
> Boolean: ...").
Thanks for the pseudo-cite, I'll track it down and add it to my white
paper trying to explain the point of relevancy ranking in a library
context:
http://bibwild.wordpress.com/2011/03/28/information-retrieval-and-relevance-ranking-for-librarians/
If you have any other such cites, feel free to share (I'm too lazy to do
the research myself right now, or at any rate I don't think it's needed
for the intended audience of the paper). Also interested in your opinion
of my essay in general, Till.
> Can we, instead of discussing the usefulness of relevance ranking over
> and over again (for, it seems, at least about 25 years, I think, all has
> been said), perhaps just start doing and improving it? I mean, we do in
> some way, driven by products from vendors,
Certainly some of us ARE doing that.
Although I have to admit, I don't think my time is particularly
efficiently spent trying to improve on Lucene's relevancy ranking
algorithm itself -- I'm not a mathematical programming type of guy, and
even if I were I doubt I could improve upon Lucene.
Instead, I spend my time (as many of us do) trying to configure the
boosting parameters and such optimally for our use cases and databases.
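
To make "configuring the boosting parameters" concrete, here is a minimal
sketch of building a Solr request that uses the dismax query parser's
`qf` (query fields) and `pf` (phrase fields) parameters. The field names
(`title_t`, `author_t`, `text`) and the boost values are made up for
illustration; they are assumptions, not anyone's real configuration.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, field_boosts, phrase_boosts=None):
    """Return a URL query string for a Solr dismax request.

    field_boosts and phrase_boosts map field names to boost factors.
    The field names passed in below are hypothetical.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        # qf: fields to search, each weighted as field^boost
        "qf": " ".join(f"{f}^{b}" for f, b in field_boosts.items()),
    }
    if phrase_boosts:
        # pf: extra boost when the whole query matches as a phrase
        params["pf"] = " ".join(f"{f}^{b}" for f, b in phrase_boosts.items())
    return urlencode(params)


# Hypothetical boosts: a title match counts ten times a full-text match.
query_string = build_boosted_query(
    "moby dick", {"title_t": 10, "author_t": 5, "text": 1}
)
print(query_string)
```

Tuning then mostly means adjusting those numbers and seeing how the
ordering of results changes for queries you care about.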
Naomi Dushay has some slides and a blog post about writing automated
tests for a Solr relevancy ranking setup, so that when you tweak the
configuration to rank new examples better, you can tell whether you're
breaking the ranking of your existing examples.
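
The general idea of such a test can be sketched roughly as follows. This
is not Naomi's code, just a minimal illustration: each expectation says
"for this query, this document should appear in the top N results". The
`fake_search` function and the document ids are stand-ins I made up; a
real test would call Solr instead.

```python
def check_relevancy(search, expectations):
    """Run each (query, doc_id, max_rank) expectation against `search`.

    `search` is any callable returning an ordered list of document ids.
    Returns a list of failure messages; an empty list means all passed.
    """
    failures = []
    for query, doc_id, max_rank in expectations:
        top = search(query)[:max_rank]
        if doc_id not in top:
            failures.append(
                f"{query!r}: expected {doc_id} in top {max_rank}, got {top}"
            )
    return failures


# Hypothetical canned results standing in for a real Solr index.
FAKE_INDEX = {
    "moby dick": ["doc_melville_1851", "doc_criticism_12", "doc_film_1956"],
    "origin of species": ["doc_commentary_7", "doc_darwin_1859"],
}


def fake_search(query):
    return FAKE_INDEX.get(query, [])


EXPECTATIONS = [
    ("moby dick", "doc_melville_1851", 1),      # must be the first hit
    ("origin of species", "doc_darwin_1859", 2),  # must be in the top two
]

failures = check_relevancy(fake_search, EXPECTATIONS)
print("all expectations hold" if not failures else "\n".join(failures))
```

Every time the boosts change, the whole list of expectations is re-run,
so a tweak that helps one query but hurts an older one shows up at once.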
Work is being done.
I think we could usefully spend more time, not on improving the
relevancy ranking itself (I think the ranking actually does okay as
is), but on improving the tools and UI we provide for increasing the
precision of searches that end up too low-precision/high-recall even
with relevancy ranking. The "facet limit" tools we all provide are one
such technique, but I think we can make em work better and be more
powerful without being more confusing.
I've got some ideas I'll save for another time. (Haven't gotten to em
yet because they will require a bit of Solr Java hacking).
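
As a toy illustration of why a facet limit raises precision without
touching the ranking, here is a small sketch with an in-memory result
list. The records, the `format` field, and the set of "relevant" ids are
all invented for the example; nothing here talks to Solr.

```python
def precision_recall(result_ids, relevant_ids):
    """Return (precision, recall) for a result list against a relevant set."""
    result_set = set(result_ids)
    hits = len(result_set & relevant_ids)
    precision = hits / len(result_set) if result_set else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def apply_facet_limit(records, field, value):
    """Keep only records whose `field` equals `value`; order is unchanged."""
    return [r for r in records if r.get(field) == value]


# Hypothetical ranked results for one query.
RESULTS = [
    {"id": 1, "format": "Book"},
    {"id": 2, "format": "Video"},
    {"id": 3, "format": "Book"},
    {"id": 4, "format": "Article"},
]
RELEVANT = {1, 3}  # what this particular user actually wanted

before = precision_recall([r["id"] for r in RESULTS], RELEVANT)
limited = apply_facet_limit(RESULTS, "format", "Book")
after = precision_recall([r["id"] for r in limited], RELEVANT)
print("before:", before, "after:", after)
```

The limit only removes records, so the relative order of what remains is
exactly what the ranking produced; the user just sees less noise.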
Received on Wed Mar 30 2011 - 16:49:54 EDT