PageRank

July 17th, 2006

The PageRank Algorithm

PageRank extends the idea behind the InDegree algorithm by assigning different weights to links. Links from high-quality pages should have a stronger impact on the rank of a page. What matters is therefore not only how many incoming links a page has, but also how important the linking pages are.

To determine the authority of web pages, PageRank simulates the behavior of a random web surfer who starts at some random page and then follows outgoing links. The starting point is usually drawn from a uniform distribution, but other distributions can be used as well. The random walk proceeds as follows: at each page, the surfer follows a randomly chosen outgoing link with probability d. With probability 1-d the surfer jumps to another random page without following any link (“gets bored clicking links”). The parameter d is called the “damping factor”; its value can vary, but it is generally assumed to equal 0.85.
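The random walk can be sketched as a small simulation. This is only an illustration under stated assumptions: d is treated as the probability of following a link (consistent with the usual value of 0.85), the four-page `toy_links` graph and the function name `simulate_surfer` are made up, and a page with no outgoing links is handled by always jumping.

```python
import random

def simulate_surfer(links, steps=100_000, d=0.85, seed=0):
    """Estimate importance as the fraction of steps the surfer spends on each page."""
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)              # uniform starting point
    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        if outgoing and rng.random() < d:
            current = rng.choice(outgoing)   # follow a random outgoing link
        else:
            current = rng.choice(pages)      # bored (or dead end): jump to a random page
    return {p: visits[p] / steps for p in pages}

# Hypothetical four-page web: page -> pages it links to
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
freq = simulate_surfer(toy_links)
```

In this toy graph, page C collects links from three pages and ends up visited most often, while D, which nobody links to, is reached only through random jumps.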

The PageRank of a page is calculated by a formula given below:

\[
PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}
\]

The algorithm considers a set of N interlinked pages p1 through pN. M(pi) is the set of pages that link to pi, PR(pj) is the PageRank of a page pj in that set, and L(pj) is the number of outgoing links on that page. Each page pj that links to pi passes it a fraction of its own PageRank, and that fraction gets smaller the more pages pj links to.
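One way to get numbers out of these definitions is to apply the formula repeatedly until the values stop changing. The sketch below assumes the standard form PR(pi) = (1 - d)/N + d * sum of PR(pj)/L(pj) over the pages pj in M(pi), and assumes every page has at least one outgoing link. The `toy_links` graph and the function name `pagerank` are illustrative only.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(pi) = (1 - d)/N + d * sum(PR(pj) / L(pj))."""
    pages = sorted(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}                 # uniform starting guess
    for _ in range(iterations):
        new_pr = {p: (1.0 - d) / n for p in pages}   # share from random jumps
        for pj, outgoing in links.items():
            for pi in outgoing:
                # pj passes an equal share of its rank to every page it links to
                new_pr[pi] += d * pr[pj] / len(outgoing)
        pr = new_pr
    return pr

# Hypothetical four-page web: page -> pages it links to
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
```

Page D has no incoming links, so its value settles at exactly (1 - d)/N = 0.0375, and the values over all pages sum to 1.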

The links between the N pages can be represented in the form of a modified adjacency matrix:

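\[
\begin{bmatrix}
l(p_1, p_1) & l(p_1, p_2) & \cdots & l(p_1, p_N) \\
l(p_2, p_1) & l(p_2, p_2) & \cdots & l(p_2, p_N) \\
\vdots & \vdots & \ddots & \vdots \\
l(p_N, p_1) & l(p_N, p_2) & \cdots & l(p_N, p_N)
\end{bmatrix}
\]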

Here l(pi,pj) equals 1/L(pj) if page pj links to page pi and zero otherwise, so that each column of the matrix sums to 1. The PageRanks of all N pages form a vector R, which is the dominant eigenvector of the adjacency matrix once it is adjusted for the random jump:

\[
\mathbf{R} = \begin{bmatrix} PR(p_1) \\ PR(p_2) \\ \vdots \\ PR(p_N) \end{bmatrix}
\]

Then the PageRank formula can be written as:

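\[
\mathbf{R} =
\begin{bmatrix} (1 - d)/N \\ (1 - d)/N \\ \vdots \\ (1 - d)/N \end{bmatrix}
+ d
\begin{bmatrix}
l(p_1, p_1) & l(p_1, p_2) & \cdots & l(p_1, p_N) \\
l(p_2, p_1) & l(p_2, p_2) & \cdots & l(p_2, p_N) \\
\vdots & \vdots & \ddots & \vdots \\
l(p_N, p_1) & l(p_N, p_2) & \cdots & l(p_N, p_N)
\end{bmatrix}
\mathbf{R}
\]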
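The matrix form can be checked numerically with a short sketch. It assumes the entries are normalized as l(pi,pj) = 1/L(pj) when pj links to pi (so each column sums to 1), that no page links to the same target twice, and that every page has at least one outgoing link. The helper names `link_matrix` and `apply_equation` and the toy graph are hypothetical.

```python
def link_matrix(pages, links):
    """Entry [i][j] is 1/L(pj) if page pj links to page pi, else 0."""
    index = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    matrix = [[0.0] * n for _ in range(n)]
    for pj, outgoing in links.items():
        for pi in outgoing:
            matrix[index[pi]][index[pj]] = 1.0 / len(outgoing)
    return matrix

def apply_equation(matrix, r, d=0.85):
    """One application of R -> (1 - d)/N + d * (matrix times R)."""
    n = len(r)
    return [(1.0 - d) / n + d * sum(matrix[i][j] * r[j] for j in range(n))
            for i in range(n)]

pages = ["A", "B", "C", "D"]
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
A = link_matrix(pages, toy_links)

r = [1.0 / len(pages)] * len(pages)   # uniform starting vector
for _ in range(100):
    r = apply_equation(A, r)

# At the solution, applying the equation once more leaves R (almost) unchanged.
residual = max(abs(x - y) for x, y in zip(r, apply_equation(A, r)))
```

Starting from any vector whose entries sum to 1, repeated application converges because each step shrinks the remaining error by a factor of d.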

PageRank values are precomputed and do not depend on search queries. When the results for a search term need to be ranked, PageRank is used in conjunction with query-specific scores. Because the PageRank scores are precomputed, results can be sorted faster.
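How the two kinds of scores are combined is not specified here, so the following is purely an illustration: it assumes a simple linear blend controlled by a made-up `weight` parameter, with invented PageRank values and query scores. The point is only that the PageRank table is looked up, never recomputed, at query time.

```python
# Precomputed offline, independent of any query (values invented for illustration).
PAGERANK = {"A": 0.37, "B": 0.20, "C": 0.39, "D": 0.04}

def rank_results(query_scores, pagerank=PAGERANK, weight=0.5):
    """Sort matching pages by a blend of query relevance and PageRank.

    The linear blend and `weight` are assumptions of this sketch,
    not a description of how any real search engine combines scores.
    """
    def combined(page):
        return weight * query_scores[page] + (1.0 - weight) * pagerank.get(page, 0.0)
    return sorted(query_scores, key=combined, reverse=True)

# Hypothetical query-specific scores for the pages that matched a search term.
results = rank_results({"B": 0.9, "C": 0.6, "D": 0.9})
```

With these numbers, B and D match the query equally well, but B is ranked first because of its higher PageRank, and C overtakes D despite a weaker query score.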
