"How many disk reads does it take to catalog an average record? How
quickly do new records get indexed?" Etc.
I don't want to be argumentative, but, simply put, it doesn't matter
anymore. Ingesting, storing and searching tiny text files has become a
trivial task. I'm not saying it's free, but for an organization of
OCLC's scale, it's trivial.
Again, take LibraryThing. We consider ourselves overtaxed parsing MARC
records from the 200+ libraries that regularly upload full dumps. What
does being taxed mean? It means that a virtual server, taking up about
1/3 of a $15,000 box, sometimes gets behind. Boo-hoo for us!
You can assume any multiple you like and the numbers don't work. Let's
imagine, for example, that OCLC needs to do 10,000 times as much
processing as LibraryThing does. (This would imply regular dumps from
2 million member libraries!) Do the math out and you'd need about $5
million worth of servers. Amortize that over a few years, and
it's a percent or two of the OCLC budget. This is an organization that
gains and loses far more from swings in its stock portfolio.
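
For anyone who wants to redo the arithmetic, here is a rough sketch in Python. It uses only the figures above (a $15,000 box, a one-third share of it, 200 libraries). The function names and the three-year straight-line amortization are my own assumptions for illustration, not anything OCLC or LibraryThing publishes. One-third of a $15,000 box is $5,000, so the hardware total is $5,000 times whatever multiple you pick: $5 million at 1,000x, $50 million at 10,000x.

```python
# Back-of-the-envelope server cost, using only figures quoted in this post.
BOX_COST_USD = 15_000   # price of one physical server (from the post)
BOX_DIVISOR = 3         # the virtual server takes about 1/3 of the box (from the post)
LIBRARIES = 200         # libraries uploading full dumps (the post says "200+")

# Hardware LibraryThing devotes to MARC parsing today: $5,000.
LIBRARYTHING_SHARE_USD = BOX_COST_USD / BOX_DIVISOR


def hardware_cost(multiple: int) -> float:
    """Hardware cost in dollars if the load were `multiple` times LibraryThing's."""
    return LIBRARYTHING_SHARE_USD * multiple


def yearly_cost(multiple: int, years: int = 3) -> float:
    """Straight-line amortization; the 3-year default is an assumption."""
    return hardware_cost(multiple) / years


if __name__ == "__main__":
    for multiple in (1_000, 10_000):
        print(
            f"{multiple:>6,}x: {LIBRARIES * multiple:>9,} libraries, "
            f"${hardware_cost(multiple):,.0f} in hardware, "
            f"${yearly_cost(multiple):,.0f} per year"
        )
```

Swap in any multiple or amortization period you like and compare the yearly figure against the budget.
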
The simple fact is that sending, receiving and processing tiny text
files has become a trivial task. It's been a trivial task for years.
And every year the costs sink still further.
There's no question non-IT tasks, like customer hand-holding,
report-writing, marketing, sales, lawyers, consultants and executives,
still cost money, but to defend OCLC's prices by citing how many
records they store, or how much processing they do, is to be out of
touch with what computers are like today. That argument is dead.
Received on Fri Aug 06 2010 - 23:43:30 EDT