Last Updated: 3rd December, 2007

Link Spam Detection »

A recently released paper, 'Link Spam Detection Based on Mass Estimation' by Zoltan Gyongyi, Pavel Berkhin, Hector Garcia-Molina and Jan Pedersen, introduced the concept of spam mass and its use in detecting pages that employ link spamming.

This article briefly discusses the background on link spamming, the research in the paper, and its impact on link building.

The background on link spamming »

Search engines are host to millions of web searchers daily. Naturally, search engine rankings are in demand.

In order to attain rankings on search engines, website owners are willing to tweak their websites, sometimes risking penalization. Many websites are more search engine friendly and less user-friendly, thanks to the fight for the top ranks.

This defeats the original purpose of search engines (providing access to quality information to the web searcher) and contaminates their search results.

The research paper on 'Link Spam Detection Based on Mass Estimation' aims to provide a way of detecting websites that attain high rankings through artificial inflation of links.

The SEO basics »

To put it simply, organic search engine rankings of a website can be influenced by tweaking on-page factors and off-page factors.

One of the off-page factors that has contributed most to a website's organic rankings is the quantity of inbound links (referred to as inlinks in the paper).

Website owners deploy different methods of link building to influence their organic rankings.

A few of these methods pollute the search results and so undermine the efforts of search engines to bring up quality webpages. Some of these manipulative tricks are difficult to detect.

The research paper concept:

The paper introduces Spam Mass - a measure of the impact of link spamming on a page's ranking. It also discusses how to estimate it in order to detect pages that employ link spamming to improve their organic rankings.

What is Spam Mass?

The paper defines spam mass as the part of a page's PageRank that it accumulates by being linked to, directly or indirectly, by spam pages.

Thus, a simplified explanation would be:

Spam mass = PR mass accumulated through spam pages

It follows that the more PageRank a page receives through links from spam pages, the higher its spam mass.
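
The simplified explanation above can also be written as a formula. The notation below is chosen for illustration and is an assumption rather than a quotation from the paper: p_x is the PageRank of page x, q_x^y is the share of that PageRank that page y contributes (through direct or indirect links), V is the set of all pages, and V^- is the set of spam pages.

```latex
% Assumed notation: p_x = PageRank of x, q_x^y = contribution of y to x,
% V = all pages, V^- = spam pages.
p_x = \sum_{y \in V} q_x^{y}
\qquad
M_x = \sum_{y \in V^{-}} q_x^{y}
```

Here M_x is the spam mass of x: the part of p_x that can be traced back to spam pages.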

How is spam mass calculated?

The paper describes in detail how spam mass is calculated. Aaron Wall has simplified it for all of us through his post on Link Spam detection on SEOBook.com.

Here is another simplified explanation:

Link spamming helps pages accumulate PageRank. The objective is to detect and remove this PR support which is coming from link spamming by evaluating not only the immediate linking partners but all the other pages that directly or indirectly contribute towards its PR.

To do this, the search engines need to first have a database of trusted pages/sites (known to be spam free) and blacklisted pages/sites. These could be computed or manually selected.

Using this core of good and bad sites as the starting point, the other sites are evaluated, and the PR contributions of good sites and of spam sites (the spam mass) towards the target site are calculated. Comparing the two leads to the target site being labeled as spam or as a good site.
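
The idea can be tried out on a toy link graph. The following is a minimal sketch, not the paper's implementation: the page names, the damping value of 0.85, and the rule "label as spam when the spam contribution is larger" are all assumptions made for illustration. It relies on the fact that, in a linear PageRank, restricting the random jump to a set of pages yields exactly the PageRank those pages contribute.

```python
def pagerank(links, jump, damping=0.85, iterations=50):
    """Linear PageRank by simple iteration.

    links: dict mapping each page to the list of pages it links to.
    jump:  dict mapping pages to their random-jump weight (missing = 0).
    Rank flowing into pages without outlinks is dropped, not redistributed.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: 0.0 for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1 - damping) * jump.get(page, 0.0) for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy graph: one trusted chain and three spam pages link to "target".
links = {
    "good1": ["target"],
    "good2": ["good1"],
    "spam1": ["target"],
    "spam2": ["target"],
    "spam3": ["target"],
    "target": [],
}
good = {"good1", "good2"}            # assumed trusted core
spam = {"spam1", "spam2", "spam3"}   # assumed blacklist
weight = 1 / len(links)              # uniform random-jump weight per page

# Restricting the jump to one set gives that set's PR contribution.
from_good = pagerank(links, {page: weight for page in good})["target"]
from_spam = pagerank(links, {page: weight for page in spam})["target"]

# Assumed decision rule, for illustration only.
label = "spam" if from_spam > from_good else "good"
print(from_good, from_spam, label)
```

In this toy graph the three spam pages together contribute more PageRank to the target than the trusted chain does, so the target is labeled as spam.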

The above explanation uses absolute spam mass. The paper promotes the use of relative spam mass: the fraction of a site's PR that is due to contributing spam sites or pages.

To calculate relative spam mass, two different PR values are computed: the regular PageRank, and a second one that gives more weight to sites known to be spam free.
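
As a rough sketch of how the two scores combine: if p is a page's regular PageRank and p' is its score when trusted sites are favored, the relative spam mass can be estimated as (p - p') / p. The scores, the example domains, and the 0.5 threshold below are made up for illustration and do not come from the paper.

```python
def relative_spam_mass(regular_pr, core_pr):
    """Estimate relative spam mass as (p - p') / p for every page with p > 0."""
    return {
        page: (p - core_pr.get(page, 0.0)) / p
        for page, p in regular_pr.items()
        if p > 0
    }


# Made-up scores, for illustration only.
regular_pr = {"a.example": 0.020, "b.example": 0.020, "c.example": 0.005}
core_pr = {"a.example": 0.018, "b.example": 0.002, "c.example": 0.004}

THRESHOLD = 0.5  # hypothetical cut-off; a real system would tune this

mass = relative_spam_mass(regular_pr, core_pr)
suspects = sorted(page for page, m in mass.items() if m > THRESHOLD)
print(suspects)
```

Pages a.example and b.example have the same regular PageRank, but almost all of b.example's score disappears once trusted sites are favored, so only b.example is flagged.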

For more details, please read the research paper.

Future of link building

Here are a few conclusive highlights:
1. Dump the use of automated software to create links
2. Quit copying and creating replicated directories
3. Closely evaluate link partners and their link partners!
4. Links from .edu and .gov sites are good!
5. Don't get too hysterical if you have links from a few spammy sites
6. Get hysterical if all you have is links from spammy sites!
7. Don't be scared of giving links out from your site
8. Don't partner with SEO firms that haven't read the Link Spam Detection paper or are not on top of things!
9. Fewer high-quality links are better than a large number of low-quality links
10. Write articles and syndicate content
11. Become media savvy: links from media are good.

Related reading:

Aaron Wall's post on Link detection research paper.
The research paper on 'Link Spam Detection Based on Mass Estimation'

About the Author:

Avneet Sethi is the co-founder and Director of CueBlocks.com, an Internet Marketing firm that helps companies develop and implement successful online marketing strategies.

Copyright

© Copyright 2007, CueBlocks. All rights reserved.

This Article is Copyright protected. Republishing & syndication of this article is granted only with the due credit, as mentioned, retained in the republished article. Permission to reprint or republish does not waive any copyright. The text, hyperlinks embedded on the article and headers should remain unaltered. This article must not be used in unsolicited mail.

Kindly visit 'Republishing articles by CueBlocks' to leave feedback, read the republishing guidelines, or request permission to reprint articles written by the CueBlocks team.
