Just an interesting tidbit of information I discovered while preparing my class on Retrieving and Evaluating Electronic Information (here’s my previous post on planning the class). Covering the topic of bias in search engines, and in Google in particular, we talked about how PageRank introduces various biases in the type of information it makes available. As reading, I assigned the excellent honors thesis (pdf, via the Internet Archive) written in 2005 by Stanford undergrad Alejandro M. Diaz. Alejandro’s (where are you now? leave a comment if you read this!) thesis is a straightforward, accessible (if not always “scientific”) account of the different biases reflected in Google and PageRank. A sample quote:

Our description of PageRank, like that put forth by its inventors, makes heavy but unqualified use of the term “important.” This is somewhat disconcerting since importance, like relevancy, is a highly subtle, ambiguous, and subjective thing… To the algorithm, being “important” simply means being “popular.”

It is therefore interesting to see how Google itself has changed the way it talks about PageRank. Thanks to the Internet Archive, I give you a direct comparison of the text on the official Google “corporate tech” page, highlighted for emphasis and your reading pleasure:

PageRank performs an objective measurement of the importance of web pages by solving an equation of more than 500 million variables and 2 billion terms. Instead of counting direct links, PageRank interprets a link from Page A to Page B as a vote for Page B by Page A. PageRank then assesses a page’s importance by the number of votes it receives.

- Google, 2002 (via the Internet Archive)

PageRank reflects our view of the importance of web pages by considering more than 500 million variables and 2 billion terms. Pages that we believe are important pages receive a higher PageRank and are more likely to appear at the top of the search results.

- Google, 2009

In fact, as you can see in the Internet Archive history for the Google Corporate Technology page, the change in language was made as late as 2007. To be precise, it happened sometime between April 6th and May 6th, 2007, the same month Google announced it was buying DoubleClick (I don’t know what this says, but conspiracy theorists are welcome to suggest ideas).
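
For anyone who has not seen the computation behind the 2002 wording, here is a minimal sketch of the “links as votes” idea. It uses the textbook power-iteration formulation of PageRank, not Google’s production system; the damping factor of 0.85 is the value suggested in the original PageRank paper, while the function name, the handling of pages with no outgoing links, and the four-page toy web are my own assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative sketch only).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links votes for everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: A, C and D all link to B, and B links back to A.
toy_web = {"A": ["B"], "B": ["A"], "C": ["B"], "D": ["B"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

B comes out on top simply because it collects the most votes, and A comes second because its single vote comes from the popular B. Nothing in the computation looks at what any page actually says, which is exactly the “important means popular” point Diaz makes.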

  1. Philipp Lenssen

    A while ago Google also changed its “totally objective” terminology in another place: the special page it advertises when you search for [jew], because a site called JewWatch once ranked best for that query. The old wording comes first, followed by the new:

    “Our search results are generated completely objectively and are independent of the beliefs and preferences of those who work at Google. Some people concerned about this issue have created online petitions to encourage us to remove particular links or otherwise adjust search results. Because of our objective and automated ranking system, Google cannot be influenced by these petitions. The only sites we omit are those we are legally compelled to remove or those maliciously attempting to manipulate our results.”

    “The beliefs and preferences of those who work at Google, as well as the opinions of the general public, do not determine or impact our search results. Individual citizens and public interest groups do periodically urge us to remove particular links or otherwise adjust search results. Although Google reserves the right to address such requests individually, Google views the comprehensiveness of our search results as an extremely important priority. Accordingly, we do not remove a page from our search results simply because its content is unpopular or because we receive complaints concerning it. We will, however, remove pages from our results if we believe the page (or its site) violates our Webmaster Guidelines, if we believe we are required to do so by law, or at the request of the webmaster who is responsible for the page.”

  2. naaman Post author

    Nice catch, Andy. Indeed, it seems like PageRank is still “objective” everywhere else, e.g., Israel (Hebrew, you’ll have to trust me on that), France, and of course China (translated).

    Philipp, that’s interesting, and I would contend that even the new version of the “explanation” is not correct… “The beliefs and preferences of those who work at Google,… do not determine or impact our search results.” Of course they do, in some way at least…

