Piotr Cofta (BT Plc) wrote a very interesting paper for the 10th Int. Conf. on Electronic Commerce (ICEC) ’08 Innsbruck, Austria.
It questions the Googleplex as a whole rather than just "Google", and it raises some serious issues. The Googleplex has an awful lot of power, and we have placed an awful lot of trust in it. To an extent, it relies on this trust to function and be successful.
Here are a few main points:
Google tries to monitor everything it can on the Internet to gather as much user data as possible, and so to monitor behaviour. It's important for Google to have a stake in every possible mode of interaction, not just search.
Google focuses on innovation, so it frantically chases top researchers to develop trends and obviously gives out free tools that are used to test and develop thousands of new ideas at the same time.
PageRank and query logs enable Google to identify trends that are likely to stay. It's cheap to run as well. The author reckons that more of this data will be available than is already presented via AdSense.
Unsurprisingly, the trends are used to fuel the advertising market.
People aren't identified during personalisation, but computers are. As we all know, logging in to one tool means that you're signed in to everything. This way an enormous amount of user behaviour data can be captured.
There are endless opportunities for new applications, but people search is super important. Google is tracking people who share similarities, habits and customs. The author describes the task as "mathematically trivial", but says that it is hugely important for us all.
The author also says that "the Googleplex is not malicious in itself." It's a business. It has a huge amount of power, and we need to see whether it eventually abuses it or not, despite the "Do no evil" strapline. Interestingly, he asks whether the Googleplex will be compromised by others. He asks three questions: whether the Googleplex can be harmful to individuals, to society and to social values. People like to develop trust and confidence in organisations like this, and that is dangerous to a certain extent.
He says that the "Crude PageRank value" can be used (as we all know) as the strength score of a site and also as its reputation score. In a way it must be said, I think, that assigning a numerical value to something is indeed giving something away to the users, so that they can feel more satisfied and confident in the organisation. Even if the numbers deliberately don't really add up, the score still fulfils its function as far as the users are concerned. SEO peeps did use it as a measure of success once, and held it as very important. Evidence of this can be seen in the mountains of blogs talking about it.
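To show what "assigning a numerical value" to a site can look like, here is a minimal sketch of the textbook PageRank iteration. It is not taken from the paper and it is certainly not what Google runs in production. The function name `pagerank`, the toy link graph and the damping factor of 0.85 are all my own assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    All names and defaults here are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    if not pages:
        return {}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        # every page gets a small baseline share on each round
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # a page passes its score on to the pages it links to
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # a page with no outgoing links spreads its score evenly
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


# toy graph: "d" links to "a" but nobody links to "d"
toy_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
scores = pagerank(toy_links)
```

In this toy graph the page nobody links to ends up with the lowest score, which is the general idea behind reading the number as a strength or reputation score.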
He says, in a totally different way than I'll put it here, that we put trust in the results because we don't know how the whole thing works. I've already said that we cannot know whether the results we are served are exactly the right ones for us, only that they are the best ones Google can come up with. Maybe the perfect documents for your needs don't show. People have such confidence that they usually defend Google passionately when this issue is raised. From a scientific perspective though, it is very natural to consider that the results might not be the best. One research project did show, for example, that Google didn't perform well at all in comparison to a ranking produced by human experts.
Do a very simplistic test: choose your area of expertise (a very specific one) and rank the most important documents you would give someone on this topic. Then check Google and see what you find. More complex tests will of course yield more exact results.
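If you want to put a number on that comparison, one option is a rank correlation such as Kendall's tau. The sketch below is my own illustration, not something from the paper. The function name `kendall_tau` and the document labels are made up, and it assumes neither list contains duplicates. It only looks at documents that appear in both lists.

```python
from itertools import combinations


def kendall_tau(expert, engine):
    """Kendall's tau between two rankings, over their shared documents.

    Both arguments are lists ordered from best to worst.
    Returns 1.0 for identical order and -1.0 for reversed order.
    """
    position = {doc: i for i, doc in enumerate(engine)}
    shared = [doc for doc in expert if doc in position]
    if len(shared) < 2:
        raise ValueError("need at least two documents in both rankings")
    concordant = discordant = 0
    for first, second in combinations(shared, 2):
        # `first` is ranked above `second` by the expert;
        # check whether the search engine agrees
        if position[first] < position[second]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# hypothetical labels standing in for real documents
my_ranking = ["doc_a", "doc_b", "doc_c", "doc_d"]
search_results = ["doc_b", "doc_a", "doc_x", "doc_c", "doc_d"]
agreement = kendall_tau(my_ranking, search_results)
```

A value close to 1 means the search engine orders the shared documents much as you would, and a value close to -1 means it orders them the other way round. It says nothing about documents that the engine never returned, so it is worth counting those separately.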
He makes a good point about the "Do no evil" thing by saying that the Googleplex can't possibly do evil or good because it's fully automated. When there have been mistakes, and when people have sporadically written in blogs or blog comments about how this can be questioned, the idea dies down quite quickly, because we love and trust Google. It's not their fault that nasty ads get served up or that an update goes painfully wrong; it's the system.
He's right in suggesting that maybe all the free stuff we get does come at quite a high price.
I urge you to read the paper in its entirety. To do that you'll need ACM Digital Library access, which is well worth the purchase. In my opinion it is a very necessary professional tool for marketeers and computer scientists alike.