No. You just said the opposite of what I tried to say, Eric, while
implying you were agreeing with me. Clearly I'm having trouble being clear.
What I'm saying is I would not _assume_ that the TF-IDF score
distribution has the shape of a long tail just because (or even if) the
user perception of relevancy assigned to the docs has the shape of a long tail.
A TF-IDF relevancy ranking that puts things in an order that often
matches a user-assigned order does NOT necessarily also assign absolute
values that match user-assigned values.
Eric, do you have particular experience with TF-IDF that says you often
get a long tail in the actual scores, not just in the user perception of
value? Because some actual data would be welcome.
I have not looked at my own numbers. It would take a bunch of work to
get some numbers on a graph, work I don't have the need or time for right now.
But on the Solr list, when people ask questions that have that as an
assumption -- like "How can I exclude the 'poorest' scored results from
my result list?" -- the answer from the Solr experts is generally that
you can't just take some arbitrary score as a cut off, because the score
has no objective meaning, and will vary from query to query and index to
index. As I quoted before:
"Scores for results for a given query are only useful in comparison to other results for that exact same query. Trying to compare scores across queries or trying to understand what the actual score means (i.e. 2.34345 for a specific document) may not be an effective exercise."
http://lucidworks.lucidimagination.com/display/LWEUG/Understanding+and+Improving+Relevance
Now, that doesn't directly answer the question. Scores MIGHT have a
"long tail" distribution when user perception of value has a long tail
distribution. But I wouldn't bet on it, and I certainly would not
_assume_ it. So far this whole discussion seems, to me, to be just
people assuming it; nobody has any data. It is not a safe assumption.
Relevancy scores succeed at putting things in the right _order_ to match
user perception of value (much of the time, not for 100% of users and
searchers of course). That does NOT mean that the relationship between
individual document scores matches user perception of value.
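
To make the distinction concrete, here is a minimal Python sketch. The
numbers are hypothetical, loosely modeled on the two illustrative lists
quoted below; they are NOT measured TF-IDF scores, and the names
(perceived, engine, ranking, top_share) are made up for this example. It
only shows that two score lists can produce the identical ordering while
having very differently shaped distributions.

```python
# Hypothetical scores for the same ten documents (illustrative only,
# not real TF-IDF output and not real user judgments).
perceived = {"d1": 100, "d2": 98, "d3": 87, "d4": 54, "d5": 35,
             "d6": 12, "d7": 4, "d8": 3, "d9": 2, "d10": 1}
engine = {"d1": 100, "d2": 70, "d3": 69, "d4": 68, "d5": 67,
          "d6": 66, "d7": 65, "d8": 64, "d9": 30, "d10": 28}


def ranking(scores):
    """Return document IDs ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def top_share(scores, k=3):
    """Return the fraction of the total score held by the top k documents."""
    ordered = sorted(scores.values(), reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Same order, different shape.
print("same order:", ranking(perceived) == ranking(engine))
print("top-3 share, perceived: %.2f" % top_share(perceived))
print("top-3 share, engine:    %.2f" % top_share(engine))
```

With these made-up numbers the two rankings are identical, yet the top
three documents hold roughly 72% of the total in the perceived list and
roughly 38% in the engine list. Whether real TF-IDF scores behave like
either list is exactly the thing that would need actual data.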
On 3/28/2011 11:07 AM, Eric Lease Morgan wrote:
> On Mar 28, 2011, at 10:58 AM, Jonathan Rochkind wrote:
>
>> It is true that the _user experience_ of TF-IDF type algorithm ranking
>> is often that you get a few highly relevant results, and then the
>> results trail off into around-equally-non-relevant...
>>
>> Even though your _evaluation_ of relevance might look like: 100, 98,
>> 87, 54, 35, 12, 4, 1, 1, 1, 1, 1, 1, 1, 1,
>>
>> The actual numbers might look like:
>>
>> 100, 70, 69, 68, 67, 66, 65, 64, 30, 29, 28, 27, 26, 10, 9, 8, 7
>
>
> Yes. If I understand the question correctly, then the TFIDF scores associated with any given search result can be described as having the shape of a "long tail", or, put another way, have a Zipfian distribution.
>
Received on Mon Mar 28 2011 - 12:20:31 EDT