Link Analysis Algorithms

July 17th, 2006

Relevance and Authority

When a user queries a search engine with a keyword, he expects more than just relevant results. For example, someone who searches for ‘Bali vacations’ would be disappointed to get a personal blog page with a story about John Doe’s awesome Bali vacation last summer. What the user was most likely looking for was a travel agent like Expedia. Thus it is critical that users get results that are not just relevant but also authoritative. And the more pages appear on the Internet every day, the further search algorithms shift from relevance toward authoritativeness.

Nowadays the relevance of a page is defined differently. It is no longer just about keyword saturation or copy structure. The context in which the page exists now defines its relevance. The context is the set of pages that link to, or are linked from, the given page. If these pages are about Bali vacations, then it is natural to expect a page they link to to be about Bali vacations as well. The page content is still used to adjust the algorithm’s results in cases where links point to an irrelevant page, for example a widely used free web statistics system.

Link Analysis Ranking Algorithms

So why is page content analysis no longer enough to get relevant search results? There is a problem of abundance: the number of pages considered relevant based on content analysis alone is too big for a human to digest. This is where searching for authoritative pages helps to narrow down the results. But authority is an even vaguer notion than relevance. Authority has to express the importance and weight of a web document. The nature of the Web as an interlinked hypertext environment suggests that links can be used to measure the degree of “public recognition” of web pages.

This idea has existed since the creation of the Internet, and Jon Kleinberg was one of the first to create a workable approach, which he described in his seminal work “Authoritative Sources in a Hyperlinked Environment” (1998). He suggested that web pages can play the role of “hubs” or “authorities”. An authority is a page that has many incoming links, that is, a high in-degree. Authorities returned as relevant to a query should show an overlap in the set of pages pointing to them. The pages that contain links to these relevant resources are called hubs. Hubs determine the relevance of authorities on a given topic and make it possible to discard non-relevant pages that merely have a high in-degree.
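To make the hub and authority idea more concrete, here is a minimal Python sketch. It is not Kleinberg's actual algorithm, only an illustration of the two observations above: in-degree alone picks up popular but irrelevant pages, and requiring an overlap in the pages that point to the candidates filters them out. The page names, the in-degree cut-off of 2 and the `min_overlap` parameter are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical (source page, target page) pairs; all page names are made up.
links = [
    ("blog-a", "expedia"),
    ("blog-a", "lonelyplanet"),
    ("blog-b", "expedia"),
    ("blog-b", "lonelyplanet"),
    ("blog-c", "expedia"),
    ("stats-user-1", "free-counter"),
    ("stats-user-2", "free-counter"),
    ("stats-user-3", "free-counter"),
]

def in_degrees(links):
    """Count incoming links for every page that is linked to."""
    counts = defaultdict(int)
    for _source, target in links:
        counts[target] += 1
    return dict(counts)

def hub_candidates(links, candidates, min_overlap=2):
    """Map each page that links to at least `min_overlap` candidates
    to the set of candidates it links to."""
    pointed = defaultdict(set)
    for source, target in links:
        if target in candidates:
            pointed[source].add(target)
    return {page: targets for page, targets in pointed.items()
            if len(targets) >= min_overlap}

def supported_authorities(hubs):
    """Candidates that at least one hub points to."""
    supported = set()
    for targets in hubs.values():
        supported |= targets
    return supported

degrees = in_degrees(links)
# Assumed cut-off: a page with two or more incoming links is a candidate.
candidates = {page for page, degree in degrees.items() if degree >= 2}
hubs = hub_candidates(links, candidates)
authorities = supported_authorities(hubs)
print(sorted(authorities))  # ['expedia', 'lonelyplanet']
```

In this toy data the free statistics counter has as many incoming links as Expedia, but no page links to it together with another candidate, so it is dropped even though its in-degree is high.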

Hubs and Authorities in an interlinked environment

Link analysis ranking algorithms use hyperlink graphs similar to the one shown above. The nodes of the hyperlink graph are web pages, and the links are its directed edges. The graph is simple: if there is more than one link from one page to another, only one is counted, and links from a page to itself are ignored. Different weights can be assigned to edges (links) according to web page analysis or other factors that search engines consider important, e.g. link or domain age.
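The graph construction just described can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `build_simple_graph`, the nested-dictionary representation and the optional `weight` callback are hypothetical, and how real search engines weight links is not public.

```python
def build_simple_graph(raw_links, weight=None):
    """Build a simple directed graph as {source: {target: edge weight}}.

    Repeated links from one page to another are collapsed into a single
    edge, and links from a page to itself are dropped. `weight` is an
    optional, hypothetical callback (source, target) -> float; without
    it every edge gets the weight 1.0.
    """
    graph = {}
    for source, target in raw_links:
        # Every page becomes a node, even if it has no outgoing links.
        graph.setdefault(source, {})
        graph.setdefault(target, {})
        if source == target:
            continue  # self-links are ignored
        if target in graph[source]:
            continue  # only one link per ordered pair of pages is counted
        graph[source][target] = weight(source, target) if weight else 1.0
    return graph


# Made-up raw links, including a duplicate and a self-link.
raw_links = [("a", "b"), ("a", "b"), ("a", "a"), ("b", "c")]
graph = build_simple_graph(raw_links)
print(graph)
```

A weighting function based on, say, link age could be passed in as `weight`; the uniform default keeps the sketch neutral about which factors matter.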

In upcoming posts I will describe the most widely used link analysis algorithms.
