Re: Cablegate from Wikileaks: a case study

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Tue, 7 Dec 2010 12:09:26 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU
On 12/7/2010 10:45 AM, Bernhard Eversberg wrote:
> An algorithm in the true sense is an automated procedure that yields
> predictable, reproducible results. As you start to add casuistry based
> on criteria that are invisible and not understandable for the observer,
> it ceases to be an algorithm, or at least it is a procedure that is
> partly (and in Google's case, heavily) influenced by all sorts of

Are you suggesting that Google's code is not an 'algorithm'?  That 
doesn't make any sense. ALL software yields predictable, reproducible 
results -- in a theoretical sense.  Including Google.

Some algorithms are so complicated that it may be hard for humans to 
predict exactly what they'll do in a given case, unless those humans 
want to spend a lot of time understanding and performing mathematical 
calculations.  But that's what we have computers for.
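
To make this concrete, here is a minimal sketch in Python (the 
features and weights are invented for illustration, not Google's 
actual formula) of a scoring function that is fully deterministic yet 
tedious to predict by hand:

    import math

    def score(doc):
        # One hypothetical ranking formula: deterministic and
        # reproducible, but not something you'd evaluate in your head.
        return (1.5 * math.log1p(doc["inlinks"])
                + 0.8 * doc["term_frequency"]
                - 0.3 * math.sqrt(doc["age_in_days"]))

    docs = [
        {"id": "a", "inlinks": 120, "term_frequency": 4, "age_in_days": 900},
        {"id": "b", "inlinks": 15,  "term_frequency": 9, "age_in_days": 30},
    ]
    # The same input always produces the same ordering.
    for d in sorted(docs, key=score, reverse=True):
        print(d["id"], round(score(d), 3))

Run it twice and you get the same ranking twice; that is all 
'predictable, reproducible' requires.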

Google's algorithm is like this because the problem they are trying to 
solve is very complicated.  If we want our software to work nearly as 
well as Google's, our algorithms will have to become more complex too.

If you insist only on simple algorithms, you will get only simple software.

It is worthwhile to point out that Google has its own business 
interests, which may not match our interests as users, and that we're 
trusting them, without any easy way to verify, to be dispassionate and 
neutral in their choices.

But it is not reasonable to demand only very simple algorithms for 
taking users' textual queries and translating them into a set of 
documents that reasonably satisfies most users most of the time.  This 
is a complicated problem.  But again, the fact that each and every one 
of us uses Google every day shows that it is possible to provide 
reasonable solutions to this problem -- just not with facile, naive 
algorithms.
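
To see what a facile naive algorithm actually looks like, here is a 
toy sketch in Python (a made-up three-document corpus; plain term 
matching, no stemming, weighting, or link analysis).  It is simple and 
fully transparent -- and it would satisfy far fewer users:

    corpus = {
        "doc1": "wikileaks cablegate diplomatic cables",
        "doc2": "library catalog search algorithms",
        "doc3": "google search ranking and relevance",
    }

    def naive_rank(query):
        # Score each document by how many query terms appear in it.
        terms = query.lower().split()
        scores = {doc_id: sum(t in text.split() for t in terms)
                  for doc_id, text in corpus.items()}
        return sorted(scores, key=scores.get, reverse=True)

    print(naive_rank("search algorithms"))  # doc2 first, then doc3
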
Received on Tue Dec 07 2010 - 12:10:45 EST