Re: The long tail

From: Kyle Banerjee <kyle.banerjee_at_nyob>
Date: Fri, 22 May 2009 08:57:17 -0700
To: NGC4LIB_at_LISTSERV.ND.EDU

> .........  If the cost to carry
> (or publish) a million items isn't that much different than the cost to
> carry a thousand, then you can make money carrying that million items in the
> long tail.
>
> Now that argument applies specifically to electronic items, and is related
> to the much cheaper marginal cost of electronic items....

I'm not sure I buy either of these premises. For small text files,
storing a million items really doesn't cost that much more than a
thousand. But many digital objects are very large and it costs plenty
to maintain them. When metadata cannot be derived automatically from
the objects, it must be added by hand. Transferring terabytes of data
across networks also takes a long time and incurs significant
bandwidth costs.
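
As a rough back-of-the-envelope illustration of the transfer-time point,
here is a minimal sketch. The 10 TB collection size, the link speeds, and
the 70% usable-throughput figure are all assumptions picked for the
example, not measurements from any real system.

```python
def transfer_hours(size_tb, link_mbps, efficiency=0.7):
    """Estimate hours needed to move size_tb terabytes over a link.

    size_tb    -- collection size in decimal terabytes (assumed value)
    link_mbps  -- nominal link speed in megabits per second (assumed value)
    efficiency -- fraction of nominal speed actually achieved (assumed value)
    """
    bits = size_tb * 1e12 * 8                  # terabytes -> bits
    usable_bps = link_mbps * 1e6 * efficiency  # usable bits per second
    return bits / usable_bps / 3600            # seconds -> hours


if __name__ == "__main__":
    for mbps in (100, 1000):
        hours = transfer_hours(10, mbps)
        print(f"10 TB over {mbps} Mbit/s: about {hours:.0f} hours")
```

Under those assumptions the move takes days rather than minutes, and that
is before counting what the bandwidth itself costs.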

Even if maintenance and processing (i.e., the most expensive part of
the picture) were free, there's still the problem that when there's
too much garbage mixed in, the noise level gets high enough that
finding the good stuff is pretty dang hard.

Ed's suggestion of the two-tier approach has already effectively been
implemented with the one-tier approach. Relevancy ranking is all about
playing the odds on what makes something important. While you can take
out factors like use/linkages (crude but reasonably useful quality
metrics) to bubble up the end of the long tail, you still have a mess
to sort through.

kyle

-- 
----------------------------------------------------------
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
banerjek_at_uoregon.edu / 503.999.9787