Popularity Ranking Faults

July 1st, 2006

The influence of search engines, and Google in particular, on the popularity of web pages is hard to exaggerate. When a user queries a search engine for a keyword or key phrase and, having looked through the first two or three SERPs, still can’t find the desired results, he eventually gives up and starts looking for something else. If a web page is not in the SERPs, it effectively doesn’t exist.

In the past, things were simpler. The few dozen pages in the search results could easily be sorted by keyword density and prominence in the body text and/or meta tags. Now there is a problem of abundance: there are hundreds of thousands, even millions, of pages to be ranked by relevance and quality. And the notions of relevance and quality have become very complex.

To measure the relevance and quality of a page, search engines employ various algorithms, of which the most widely used are the popularity-based ones. The idea is that the more links point to a page, the more popular the page is, and the higher it ranks in the SERPs. In reality, things are not that simple. Ranking algorithms have become quite sophisticated in how they evaluate page popularity. They are able to detect artificially inflated link popularity and to assign different weights to certain parameters in order to eliminate possible manipulation of the rankings. They can also use other signals to determine popularity; for example, Google ranks older websites higher, on the assumption that websites with a longer history are well established.

Rich Get Richer

The problem is: can popularity-based ranking systems discriminate against new websites in the SERPs? If a new website has no incoming links, it can’t be found in search results. If it is not in the SERPs, no one will find it and link to it. It is a vicious circle. Conversely, more popular sites become even more popular: the ‘rich get richer’.

How do search engine ranking algorithms affect website popularity? Is there a pattern in how popularity evolves? A study by Junghoo Cho and Sourashis Roy of UCLA offers some answers.

The study’s results show that the ‘rich-get-richer’ situation does indeed exist. Two snapshots of the same web segment, consisting of more than 5 million pages, were downloaded seven months apart. PageRank and the number of incoming links were calculated for each snapshot. The pages were then divided into ten groups according to their popularity.

Fig. 1 Absolute increase in the number of incoming links. Source: J.Cho and S.Roy (2004)

The popularity groups are plotted on the X axis as percentiles of the total number of pages in the snapshots: 90-100% is the most popular group, 80-90% is the second most popular, and so on. The Y axis shows the difference between the numbers of incoming links in the two snapshots.

It turns out that the popularity of the bottom six groups barely changes, while the top two groups obtain 70% of all new incoming links.
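The grouping procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ code: the in-link counts are made-up toy numbers, and `share_of_new_links_by_group` is a hypothetical helper introduced here. It ranks pages by their in-link count in the first snapshot, splits them into ten equal groups, and reports each group’s share of all newly gained links. It assumes at least as many pages as groups and at least one new link overall.

```python
def share_of_new_links_by_group(before, after, groups=10):
    """Split pages into equal-size popularity groups (least popular first)
    and return each group's share of all newly gained in-links."""
    # Rank page indices by their in-link count in the first snapshot.
    order = sorted(range(len(before)), key=lambda i: before[i])
    size = len(order) // groups
    gains = []
    for g in range(groups):
        start = g * size
        # The last group absorbs any leftover pages.
        end = (g + 1) * size if g < groups - 1 else len(order)
        members = order[start:end]
        gains.append(sum(after[i] - before[i] for i in members))
    total = sum(gains)
    return [gain / total for gain in gains]


# Hypothetical toy data: in-link counts of 20 pages in two snapshots.
before = [0, 0, 1, 1, 2, 2, 3, 3, 5, 5, 8, 8, 13, 13, 21, 21, 34, 34, 55, 55]
after = [0, 0, 1, 1, 2, 2, 3, 3, 5, 6, 9, 9, 15, 15, 25, 26, 44, 45, 80, 85]

shares = share_of_new_links_by_group(before, after)
for g, share in enumerate(shares):
    print(f"group {g * 10}-{(g + 1) * 10}%: {share:.0%} of new links")
```

With real crawl data the two lists would hold one entry per page present in both snapshots; the toy numbers here were simply chosen so that most new links go to pages that were already popular.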

Fig. 2 Absolute increase in the PageRank values. Source: J.Cho and S.Roy (2004)

The PageRank measure of popularity shows an even more curious picture. While PR has increased for the top groups, the less popular groups have lost some of their PR.

Therefore, under a popularity-based ranking system such as PageRank, the more popular pages do indeed become more popular, while unpopular pages become even less popular.
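One way to see how a page can lose PageRank without losing a single link: in the standard formulation, PageRank scores are normalized so that they sum to 1 across all pages, so a gain for one page has to come out of the others. The sketch below is offered as an illustration of that property, not as the authors’ explanation of Fig. 2. It uses a basic power-iteration PageRank; the four-page graphs and page names are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic power-iteration PageRank over {page: [pages it links to]}.
    Scores are normalized so that they always sum to 1."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical first snapshot: "hub" and "a" are the popular pages.
snapshot_1 = {"hub": ["a"], "a": ["hub"], "b": ["a"], "new": ["b"]}
# Second snapshot: "b" and "new" each add a link to the popular "hub".
snapshot_2 = {"hub": ["a"], "a": ["hub"], "b": ["a", "hub"], "new": ["b", "hub"]}

r1 = pagerank(snapshot_1)
r2 = pagerank(snapshot_2)
for page in snapshot_1:
    print(f"{page:4s} {r1[page]:.4f} -> {r2[page]:.4f}")
```

In this toy example only "hub" gains links, yet "a" and "b" end up with lower scores in the second snapshot even though neither lost an incoming link.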

Popularity Evolution: With And Without SEs

But what if there were no search engines, just random web surfing from one link to another? Would a new and not yet popular page have a better chance of increasing its popularity than it does in a world where search engines exist?

Under this ‘random surfer’ model, the popularity of a web page is defined as the fraction of all web users who like the page. For example, if 100,000 of 1,000,000 users currently like the page, its popularity P(t) is 0.1. A second notion, the so-called ‘visit popularity’ V(t), is the number of visits that the page gets within a unit time interval at time t. V(t) is proportional to the page’s popularity P(t):

V(t) = rP(t),

where r is a normalization constant.

Using the above propositions, J. Cho and S. Roy derived the popularity evolution function, which turns out to be S-shaped. According to this function, the popularity of a web page passes through three stages: infancy, expansion and maturity.

Fig. 3 Popularity evolution under the ‘random surfer’ model. Source: J.Cho and S.Roy (2004)
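The article does not reproduce the evolution function itself. As a rough illustration only, the sketch below assumes a standard logistic curve, which has the same S shape. The parameters `quality`, `p0` and `rate`, and the 10% and 90% thresholds used to label the three stages, are hypothetical choices made for this example rather than values from the study.

```python
import math


def logistic_popularity(t, quality=1.0, p0=1e-4, rate=1.0):
    """Logistic curve that starts at p0 (t = 0) and levels off at `quality`."""
    return quality / (1.0 + (quality / p0 - 1.0) * math.exp(-rate * quality * t))


def stage(popularity, quality=1.0):
    """Label a popularity value using arbitrary 10% / 90% thresholds."""
    if popularity < 0.1 * quality:
        return "infancy"
    if popularity < 0.9 * quality:
        return "expansion"
    return "maturity"


for t in range(0, 21, 2):
    p = logistic_popularity(t)
    print(f"t={t:2d}  P(t)={p:.4f}  {stage(p)}")
```

Printed over time, the values stay near zero for a while, rise quickly through the middle of the range, and then flatten out, which is the infancy, expansion and maturity pattern described above.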

This pattern can be verified empirically. Google’s own popularity evolution has the same shape. Google.com is probably the website whose popularity is least influenced by search engine rankings; it therefore fits the ‘random surfer’ model well. According to the Nielsen Net-Ratings data, Google’s popularity evolution since 1998 looks like this:

Fig. 4. Google’s popularity evolution. Source: J.Cho and S.Roy (2004)

However, for the majority of pages on the web, popularity evolution is significantly influenced by their search engine rankings. Using empirical data, J. Cho and S. Roy found that under the search-dominant model the relationship between V(t) and P(t) looks different:

V(t) = rP(t)^(9/4),

where r is a normalization constant.

Now, how much traffic would a less popular page get compared with a more popular one under each model? Let one page’s popularity be P1(t) = 0.9 and another’s P2(t) = 0.1. Then, according to the relationships between popularity and the number of visits, we get:

  • ‘Random Surfer’:
    V1(t)/V2(t) = rP1(t) / rP2(t) = 0.9/0.1 = 9
  • ‘Search-Dominant’:
    V1(t)/V2(t) = (P1(t)/P2(t))^(9/4) = 9^(9/4) ≈ 140

The difference is huge. In the second case, however, traffic is influenced not only by actual popularity but also by position in the SERPs: a user is more likely to click a link in a higher position than any link below it.
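The comparison can be checked numerically. The snippet below is a small sketch that assumes the two relations read V(t) = r·P(t) for the ‘random surfer’ model and V(t) = r·P(t)^(9/4) for the ‘search-dominant’ model; the function name `visits` and its parameters are introduced here for illustration. The constant r cancels in the ratio, so its value does not matter.

```python
def visits(popularity, r=1.0, exponent=1.0):
    """Visit popularity V = r * P**exponent.
    exponent=1 gives the 'random surfer' relation,
    exponent=9/4 the 'search-dominant' one (as read from the text)."""
    return r * popularity ** exponent


p1, p2 = 0.9, 0.1

random_surfer_ratio = visits(p1) / visits(p2)
search_dominant_ratio = visits(p1, exponent=9 / 4) / visits(p2, exponent=9 / 4)

print(f"random surfer:   {random_surfer_ratio:.1f}x more visits")
print(f"search-dominant: {search_dominant_ratio:.1f}x more visits")
```

Under these assumptions the more popular page gets about 9 times the traffic in the first model and about 140 times in the second.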

Again, as was done for the ‘random surfer’ model, J. Cho and S. Roy derive a popularity evolution function for the search-dominant model. The graph of this function looks like this:

Fig. 5. Popularity evolution under the ‘search-dominant’ model. Source: J.Cho and S.Roy (2004)

A closer look at the right side of the graph shows that it is also S-shaped and passes through the same three phases.

Fig. 6. A closer look into popularity evolution under the ‘search-dominant’ model. Source: J.Cho and S.Roy (2004)

Note that under the ‘random surfer’ model it takes only 25 units of time to reach the maturity phase, while under the ‘search-dominant’ model it takes 1650, or 66 times longer. Another difference is the very short expansion phase in the ‘search-dominant’ model: popularity grows almost instantaneously. A possible explanation: once a page ranks high enough to appear in the top positions in the SERPs, it starts receiving more traffic and consequently more incoming links, which results in a higher ranking, more traffic, and so on.
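The qualitative difference between the two models can be reproduced with a toy simulation. The dynamics below are an assumption made for this sketch, not the model from the study: popularity is taken to grow at a rate proportional to visits times the fraction of users not yet won over, dP/dt = r · P^k · (1 − P), with k = 1 for the ‘random surfer’ relation and k = 9/4 for the ‘search-dominant’ one. The starting popularity `p0`, the constant `r`, the step `dt` and the 10% and 90% thresholds are likewise hypothetical, so the numbers will not match the 25 and 1650 time units quoted above.

```python
def time_to_reach(target, exponent, p0=0.01, r=1.0, dt=0.01):
    """Euler-integrate dP/dt = r * P**exponent * (1 - P) starting from p0
    and return the time at which P first reaches `target`.
    Assumes 0 < p0 < target < 1."""
    p, t = p0, 0.0
    while p < target:
        p += dt * r * p ** exponent * (1.0 - p)
        t += dt
    return t


for name, exponent in [("random surfer", 1.0), ("search-dominant", 9 / 4)]:
    t10 = time_to_reach(0.1, exponent)  # end of "infancy" (arbitrary threshold)
    t90 = time_to_reach(0.9, exponent)  # start of "maturity" (arbitrary threshold)
    print(f"{name:16s} reaches 90% at t={t90:7.1f}; "
          f"expansion (10% to 90%) takes {(t90 - t10) / t90:.0%} of that time")
```

Even in this simplified setting, the ‘search-dominant’ exponent makes the page spend far longer in infancy, while the expansion phase shrinks to a small fraction of the total time.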

Conclusion

The picture is rather grim. Because of popularity-based ranking, some recently created pages with high-quality content but low initial popularity will (almost) never make their way into the top search results. Unless special measures are taken to obtain some quality incoming links, these pages may never be seen by any web user.
