Re: The long tail

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Fri, 22 May 2009 12:32:33 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

I'm not sure I buy it either, but it's not clear to me how it matters to
libraries either way! But that's the nature of the "long tail theory
of the internet", whether or not you buy it; the original article cited
in this thread was indeed challenging it.

But does it matter to us either way?

Kyle Banerjee wrote:
>> .........  If the cost to carry
>> (or publish) a million items isn't that much different than the cost to
>> carry a thousand, then you can make money carrying that million items in the
>> long tail.
>>
>> Now that argument applies specifically to electronic items, and is related
>> to the much cheaper marginal cost of electronic items....
>>
>
> I'm not sure I buy either of these premises. For small text files,
> storing a million items really doesn't cost that much more than a
> thousand. But many digital objects are very large and it costs plenty
> to maintain them. When the nature of the objects is such that metadata
> cannot be automatically derived, it must be added. It takes a long
> time to transfer terabytes of data across networks and involves
> significant bandwidth costs.
>
> Even if maintenance and processing (i.e. the most expensive part of
> the picture) were free, there's still the problem that when there's
> too much garbage mixed in, the noise level gets high enough that
> finding the good stuff is pretty dang hard.
>
> Ed's suggestion of the two-tier approach has already effectively been
> implemented with the one-tier approach. Relevancy ranking is all about
> playing the odds on what makes something important. While you can take
> out factors like use/linkages (crude but reasonably useful quality
> metrics) to bubble up the end of the long tail, you still have a mess
> to sort through.
>
> kyle