Monday, April 26, 2004

The Observer | Business | The Networker: What can't you find on Google? Vital statistics

"The computing engine that powers Google is the largest cluster of Linux servers in the history of the world...

Wall Street - with its beady eye on the forthcoming IPO - wants to know what Google does (and more importantly, what it plans to do next). Computer scientists, in contrast, want to know how Google does it...

it seems that the overall aim is to understate every aspect of Google's technology and technical performance by several orders of magnitude.

How do we know this? Mainly because of internal inconsistencies in the data provided by Google employees. One university presentation, for example, claimed that Google handled 150 million queries a day, and 1,000 per second at peak times. This prompted Simson Garfinkel of MIT's Technology Review to do some simple calculations. If the system is handling a peak load of 1,000 queries per second, he reasoned, that translates to a peak rate of 86.4 million queries per day - or perhaps 40 million queries per day if you assume that the system spends only half its time at peak capacity. 'No matter how you crank the math', he concluded, 'Google's statistics are not self-consistent'...

But what it all comes down to is this: Google has far more computing power at its disposal than it is letting on. In fact, there have been rumours in the business for months that the Google cluster actually has 100,000 servers - which if true means that the company's technical competence beggars belief.

Now the interesting question raised by all this is: why the reticence? Most companies lose no opportunity to brag about their technology. (Think of all those Oracle ads.) Is this an example of Google behaving ultra-responsibly - being careful not to hype its prospects prior to an IPO? Or is it a sign of a deeper commercial strategy? The latter is what Garfinkel suspects. 'After all,' he says, 'if Google publicised how many pages it has indexed and how many computers it has in its data centres around the world, search competitors such as Yahoo!, Teoma, and Mooter would know how much capital they had to raise in order to have a hope of displacing the king at the top of the hill.' If truth is the first casualty of war, openness is the first casualty of going public."
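
For anyone who wants to rerun the back-of-the-envelope check quoted above, here is a minimal sketch in Python. The two input figures (150 million queries a day, 1,000 per second at peak) come straight from the quoted article; the function name and the `peak_fraction` parameter are my own, and the half-time case assumes, as a simplification, that no queries at all arrive off-peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_ceiling(peak_qps, peak_fraction=1.0):
    """Queries per day if the system runs at peak_qps for the given
    fraction of the day and is idle otherwise (a simplifying assumption)."""
    return peak_qps * SECONDS_PER_DAY * peak_fraction


# Figures as reported in the quoted article.
claimed_daily = 150_000_000  # queries per day
peak_qps = 1_000             # queries per second at peak

full_day_at_peak = daily_ceiling(peak_qps)       # peak rate sustained all day
half_day_at_peak = daily_ceiling(peak_qps, 0.5)  # peak rate for half the day

# Average rate the daily claim would require, in queries per second.
implied_average_qps = claimed_daily / SECONDS_PER_DAY

print(f"Ceiling at constant peak: {full_day_at_peak:,.0f} queries/day")
print(f"Ceiling at peak half the time: {half_day_at_peak:,.0f} queries/day")
print(f"Average rate implied by the daily claim: {implied_average_qps:,.0f} queries/sec")
print(f"Daily claim exceeds the ceiling: {claimed_daily > full_day_at_peak}")
```

Running the peak rate around the clock gives 86.4 million queries a day, and the half-time case gives 43.2 million, which the article rounds to 40 million. Either way the claimed 150 million a day would need an average rate higher than the stated peak, which is the inconsistency Garfinkel is pointing at.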
