
In one of my recent posts I wrote about the duplicate content issue. This topic is especially important to me since my blog uses the WordPress content management system which, when used with the default configuration, is not duplicate content proof. In fact this CMS is capable of rendering almost 100% of your content duplicate. As usual, the fault of the system has its roots in its advantages. WordPress has many features that facilitate blogging and linking, such as RSS feeds for posts and comments, trackback URLs, monthly archives and so on. At the same time, this variety of URLs returning similar or identical pages is a clear case of duplicate content.

WordPress And Duplicate Content

The first evidence of duplicate content produced by WordPress can be found in your sidebar: category pages and monthly/daily archives. Category pages list the articles posted under the same topic, a category. Such pages have no unique content; they are just a collection of your previous posts. Monthly and daily archives likewise simply group your previous articles by posting date. When you have only one post on a given day, the archive page for that date and the post itself are identical.

The next case of duplicate content is even more prominent. It can be your home page itself. If it contains not excerpts but the full text of your posts, then it duplicates your post pages. This also applies to the ‘next/previous entries’ pages – those accessible via /page/2, /3, /4 etc.

Feeds. Search engine spiders crawl all the content they can reach and of course this includes RSS feeds too. The additional problem with them is that Google may choose to display your RSS URL in the search results over the link to the original post. In this case the user who clicks this result will see an XML formatted page which is not ‘human-friendly’.

Trackback URLs. Many WordPress templates add trackback links after posts. These links enable authors to track who links to their posts. Usually, if your post URL looks like ‘www.yoursite.com/2006-11-30/yourpost/’, its trackback URL will be ‘www.yoursite.com/2006-11-30/yourpost/trackback/’.

Identical meta-description. By default WordPress doesn’t provide a tool to add unique meta description tags to your posts, so they either have none or share a single site-wide description. Having no meta description at all is a disadvantage, as a properly written one can make your snippet stand out in a SERP. Having an identical description on all your pages is a threat, as Google might filter them out as too similar. (see a thread here)

Because of duplicate content, Google search can return less desirable URLs (such as feeds or archives instead of the original posts); your pages can be dropped from its index or placed in the supplemental results, which are rarely displayed to users.

Solving the Duplicate Content Issue in WordPress

Adding ‘noindex, follow’ tags

What can you do to avoid this problem? You can tell the search engines which URLs to index by using the ‘noindex, follow’ meta tag, robots.txt exclusions or 301 redirects. Let’s say you want Google to index your front page, posts, single pages and category pages, and to keep archives and ‘next entries’ pages (/page/2, /3, …) out of the index. (Feeds are not rendered through header.php, so they are dealt with in the robots.txt section further down.) To do this, add the following code to your header.php:

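Code:
<?php
// Index the first page of the home page, single posts, static pages and category pages.
if ((is_home() && ($paged < 2)) || is_single() || is_page() || is_category()) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    // Everything else (date archives, /page/2 and so on): keep out of the index, but follow the links.
    echo '<meta name="robots" content="noindex,follow" />';
}
?>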

For those not familiar with editing templates in WordPress: in your dashboard click the Presentation menu item and, once the new page opens, click Theme Editor. In the Theme Editor choose ‘header.php’ and paste the above code into the editor form. The code can be inserted anywhere between the <head> and </head> tags.

Here the <meta name="robots" content="index,follow" /> tag is added to the home page but not to its ‘next entries’ pages (is_home() && ($paged < 2)); to your posts (is_single()); to solo pages like ‘About me’, if you created any (is_page()); and to category pages (is_category()). If you don’t want your categories to be indexed, just delete || is_category(). All the other pages will get <meta name="robots" content="noindex,follow" />. They will not be indexed, but this will not prevent crawlers from following their outgoing links.
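Once the snippet is in place, it is worth checking what a given page really sends to the crawlers. Here is a minimal sketch in Python for doing that. It is not part of WordPress, and the class name, function name and sample page are made up for illustration. It pulls the content of every robots meta tag out of a page's HTML source:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.directives.append((attributes.get("content") or "").lower())


def robots_directives(html):
    """Return the robots meta values found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, roughly what a monthly archive might render.
sample_archive_page = """
<html><head>
<title>November 2006</title>
<meta name="robots" content="noindex,follow" />
</head><body></body></html>
"""

print(robots_directives(sample_archive_page))
```

Paste in the source of an archive page and of a single post: with the header.php code above you would expect ‘noindex,follow’ for the first and ‘index,follow’ for the second.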

Adding unique meta description

For this purpose I use the Head Meta Description plugin. This plugin can be configured to use an excerpt of your post as a meta description, which is especially useful if you have to add this tag to hundreds of existing pages. Or you can add your own manually as a custom field, which is my personal preference.
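To illustrate what ‘using an excerpt as a meta description’ amounts to, here is a rough Python sketch. It is not the plugin's code. The function names and the 155-character limit are assumptions made for this example. It strips the markup from the beginning of a post, trims the text at a word boundary and escapes it for use inside the tag:

```python
import re
from html import escape, unescape


def make_meta_description(post_html, max_length=155):
    """Build a plain-text description from the start of a post."""
    text = re.sub(r"<[^>]+>", " ", post_html)  # drop HTML tags
    text = unescape(text)                      # decode entities such as &amp;
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
    if len(text) <= max_length:
        return text
    # Leave room for the ellipsis and cut at the last space before the limit,
    # so that no word is split in the middle.
    limit = max_length - 3
    cut = text.rfind(" ", 0, limit + 1)
    if cut == -1:
        cut = limit
    return text[:cut].rstrip(" ,;:.") + "..."


def description_tag(description):
    """Wrap a description in a meta tag, escaping quotes and ampersands."""
    return '<meta name="description" content="%s" />' % escape(description, quote=True)


# Hypothetical post body.
post = "<p>WordPress has many   features &amp; options.</p>"
print(description_tag(make_meta_description(post)))
```

A hand-written description in a custom field will usually read better than an automatic excerpt; the sketch only covers the bulk case of hundreds of existing posts.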

Using the more tag

By inserting the <!--more--> tag into a post you tell WordPress to display only the part of the post that precedes it, typically the first few lines, on the home page. This greatly reduces the similarity between the home page and your articles. If you have too many existing posts to edit, you can use an ‘excerpt’ plugin, such as this one from Semiologic.
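To make the effect concrete, here is a small Python sketch of the idea. It is an illustration only, not WordPress's own code, and the function name and the sample post are hypothetical. Everything before the <!--more--> marker is the teaser shown on the home page, and the rest appears only on the post's own page:

```python
import re

# The marker WordPress looks for; optional custom link text may follow "more".
MORE_TAG = re.compile(r"<!--more(.*?)-->")


def split_at_more(post_content):
    """Return (teaser, rest). rest is None when the post has no more tag."""
    match = MORE_TAG.search(post_content)
    if match is None:
        return post_content, None
    teaser = post_content[:match.start()].rstrip()
    rest = post_content[match.end():].lstrip()
    return teaser, rest


# Hypothetical post body as typed into the editor.
post = "Intro paragraph.\n<!--more-->\nThe full article continues here."
teaser, rest = split_at_more(post)
print(teaser)
```

With the tag in place, the home page repeats only the teaser, so most of the text exists under a single URL.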

Redirect to a canonical URL

You should edit your .htaccess file to perform 301 redirects. Non-www addresses like yoursite.com should be redirected to www.yoursite.com. URLs without a trailing slash, like www.yoursite.com/category, should be redirected to include one: www.yoursite.com/category/. This can be done by inserting the following code into your .htaccess file:


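RewriteEngine On
RewriteBase /
# Redirect any other host name (e.g. yoursite.com) to www.yoursite.com with a 301.
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
# Add a missing trailing slash to URLs that are not real files.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_URI} !\.[a-zA-Z0-9]+$
RewriteRule ^(.*)$ http://www.yoursite.com/$1/ [R=301,L]
# Standard WordPress rules: pass everything that is not a file or directory to index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]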

For more details I advise you to read this: the process of rewriting the URL layout.
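Before touching a live .htaccess it can help to write down which address each URL is supposed to end up at. The two rules described above (force the www host, add a trailing slash) can be sketched in Python as follows. This is a simplified model, not a replacement for mod_rewrite. The function name is hypothetical, and "a last path segment containing a dot is a file" is an assumption of the sketch:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"  # placeholder host name used in this article


def canonical_url(url):
    """Return the address a URL should be 301-redirected to."""
    scheme, _host, path, query, fragment = urlsplit(url)
    path = path or "/"
    # Add a trailing slash unless the last segment looks like a file name
    # (assumption: anything with a dot in it, e.g. wp-login.php).
    last_segment = posixpath.basename(path)
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # Always answer under the single canonical host.
    return urlunsplit((scheme, CANONICAL_HOST, path, query, fragment))


for url in ("http://yoursite.com/category",
            "http://www.yoursite.com/2006-11-30/yourpost/"):
    print(url, "->", canonical_url(url))
```

If the function returns its input unchanged, the URL is already canonical and should be served directly. Any other result is where the 301 should point.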

Preventing spiders from crawling feeds and auxiliary pages

For this purpose, edit your robots.txt file and insert the following rules:

User-agent: *
Disallow: /wp-
Disallow: /search
Disallow: /feed
Disallow: /comments/feed
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/*/feed/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/feed/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/*/*/trackback/$
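The * and $ wildcards are an extension to the original robots.txt convention. Google documents support for them, but other crawlers may read these lines differently. To see which of your URLs the rules above would catch under that wildcard reading, here is a small Python sketch. It is a simplified model that ignores Allow lines and per-crawler sections, and the function names are hypothetical:

```python
import re

# The Disallow paths from the robots.txt listed above.
RULES = [
    "/wp-", "/search", "/feed", "/comments/feed", "/feed/$",
    "/*/feed/$", "/*/feed/rss/$", "/*/trackback/$",
    "/*/*/feed/$", "/*/*/feed/rss/$", "/*/*/trackback/$",
    "/*/*/*/feed/$", "/*/*/*/feed/rss/$", "/*/*/*/trackback/$",
]


def rule_to_regex(rule_path):
    """Translate a Disallow path with * and $ wildcards into a regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # '*' stands for any run of characters; everything else is literal.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url_path, disallow_rules=RULES):
    """True if any rule matches the path, starting from its first character."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


for path in ("/2006-11-30/yourpost/",
             "/2006-11-30/yourpost/trackback/",
             "/2006-11-30/yourpost/feed/",
             "/feedback/"):
    print(path, is_blocked(path))
```

Note that a rule without $ is a prefix match, so ‘Disallow: /feed’ also catches a hypothetical page such as /feedback/, and ‘Disallow: /search’ would catch a post whose URL begins with /search. Check your own permalinks before copying the list.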

Two more practical tips

Some people find it useful to restrict the number of posts displayed on the home page to 4-5, as fewer posts are duplicated.

A great article on customizing the more tag in WordPress.

To Sum Up:

To avoid the duplicate content issue in WordPress you should:

  • Add the ‘noindex, follow’ meta tag to your monthly/weekly/daily archives, ‘next entries’ pages and, if necessary, category pages
  • Ensure that all your pages have unique meta-description tags
  • Set up 301 redirects for your non-www URL and URLs without trailing slashes
  • Restrict search engine crawlers from indexing your feeds and trackbacks
  • Use the more tag to show excerpts on your home page instead of full posts
  • Restrict the number of posts displayed on your home page


76 Responses to “How to Make a WordPress Blog Duplicate Content Safe”

  1. reseller hosting Says:

    These are really good points to look upon and certainly I will implement few of them on my wordpress blog ASAP.

  2. SmartNerd Says:

    Thanks, I found this usefull for my own blog, especially the extra robots.txt file additions and the canonical URL redirection script.

  3. Cameron Turner Says:

    Excellent article. I’ve been looking for a meta-description plugin. Thanks for point it out!

  4. Andre SC Says:

    Thanx, well writen usefull resource!
    B.t.w. assuming the code for the header index or not php will most likely be inserted it may make sense to include the php open and close tags as well as the closing } of the else block in the code example :-)

    Nonewithstanding just tagged this into stumble upon, hope you get lots of admiring traffic !

    WordPress Rocks!

  5. oleg.ishenko Says:

    Thanks Andre - indeed a lot of stumblers visited this page after you promoted it!

  6. Andre SC Says:

    Kewl :-), I’m glad, its a good piece of work.

  7. PixelPLEXUS » Blog Archive » How to Make a WordPress Blog Duplicate Content Safe Says:

    [...] » How to Make a WordPress Blog Duplicate Content Safe rest of the research blog also well worth a scan. Oh, and the author seems like a realy nice guy. [...]

  8. Search Marketing Facts » How to Make a WordPress Blog Duplicate Content Safe Says:

    [...] In one of my recent posts I wrote about the duplicate content issue. This topic is especially important to me since my blog uses the WordPress content management system which, when used with the default configuration, is not duplicate content proof. In fact this CMS is capable to render almost […]Read full entry [...]

  9. SigT Says:

    How to fix the duplicate content problem in WordPress…

    The duplicate content topic is being discussed quite a lot around the blogosphere (in fact, I wrote about it myself a couple of days ago), and today I am going to focus on WordPress.

    First of all, I should say that Online Marketing Research has written How to Make…

  10. www is depreciated here | The Texas Blog Says:

    [...] I am trying to clean up my duplicate WordPress content here at ecrosstexas.com. [...]

  11. Philip Tiangson Says:

    thanks for the helpful post..i really appreciated…

  12. Alex Says:

    Thank you for this! I’ve been looking all around for a decent article on the best methods relating to WordPress and spiders, and this is everything wrapped up in one:)

  13. WordPress y las penalizaciones de Google por contenido duplicado - Fernando Gomez - wordpress - dominios - proyectos web - monetizacion web - seo Says:

    [...] The best summary of what I have described, with some very good contributions, is at SeoResearcher. We will start testing all of this right away and will monitor the logs, robot visits to the blogs, indexing, etc., and I will let you know whether the problems get solved. [...]

  14. Wordpress y las penalizaciones en buscadores por duplicados - carrero Bitácora de los Hermanos Carrero, David Carrero Fernández-Baillo y Jaime Carrero Fernández-Baillo. Todo sobre Internet, Tecnología, Tendendias, Dominios, Bitácoras, Diseño y Progr Says:

    [...] I recommend the following articles on this topic: Sigt tells us Cómo arreglar el problema del contenido duplicado en WordPress. DupPrevent, a WordPress plugin for controlling duplicate content. Fernando Gómez writes about WordPress y las penalizaciones de Google por contenido duplicado. Seo researcher: How to Make a WordPress Blog Duplicate Content Safe (link to Google cache). Mariano also talks about El contenido duplicado en Google. [...]

  15. WordPress Tips and Strategies » Webomatica Says:

    [...] Here’s a run down of some strategies you can try. I’ve done the “noindex, follow” thing for my category pages on this blog. If you click on a category in the sidebar and do “view source” there is a meta tag instructing a search enging to not index this page, but follow the links so it finds the actual post pages and indexes them. [...]

  16. GeneaBabble Says:

    I just found this article after getting nailed by G for duplicate content. I already knew about most of these things, but to have them all in one place made my job a lot easier. Thanks!

  17. stuey web design Says:

    Hi Andre,

    This is great information. Thank you very much. I have been suffering from duplicate content in the SERPS and hoping that this will solve it!

    Thanks for the valuable post.

  18. muztagh Says:

    Thank you very much fro your detailed input, this problem you are pointed out has been a big issue for our site

  19. Aaron Says:

    This a great article - very helpful stuff!

    I’ve had a couple of first page searches completely dissapear off Google due to this duplicate content issue. Thanks for all the tips here.

    A couple of questions:
    1) This problem seems to be exclusively a Google problem (from my experience) - therefore in your Robot.txt suggestion, would it be wiser to specify Googlebot rather than blanket-disallowing all bots?

    2) Do you have any information how long it takes for your SERPs to recover after you implement these changes? I want my traffic back :(

    Thanks

  20. Laptop Tech Says:

    Thank you for this post, it’s great!
    I agree with Andre SC, here’s an example from my Wordpress blog:

    ‘;
    } else {
    echo ”;
    }

    ?>

    I excluded Categories too. Insert this code between and in your header. Works great!

  21. Laptop Tech Says:

    Sorry, for some reason the code is invisible in the comment.

  22. miLienzo.com » WordPress, duplicate content and the power of Google Says:

    [...] Fear not though, there are ways to make WordPress search engine friendly. I’m not going to post a detailed technical ‘how to’ but I strongly recommend all WordPress users take a look at this excellent article on SEO Marketing Research. [...]

  23. Matt Ellsworth Says:

    This is a great post - I’ll definitely be taking some of it into effect in the blogs we run.

  24. A Threat to Your Wordpress Blog: Duplicate Content » Web Marketing News Says:

    [...] Full version of the article including tutorial can be found here: Making Your Wo. (continues) [...]

  25. » duplicate content issue Says:

    [...] Taken straight from the source, the following article may make WordPress users more careful about duplicate content, or else your site will end up with few pages in search engines such as Google, which seems to be getting stricter with plagiarists. [...]

  26. xt commerce blog - Freier Holger Says:

    good article - i have find a plugin to avoid dc
    here the link: http://seologs.com/duplicate-content-cure/

  27. Zeb Says:

    Question… I’ve implemented similar actions on my WP blogs before but in looking at your robots file I’ve thought of question in relation to side effects of not allowing gbot to index your feeds.

    a ton of my traffic comes from google blog search. I should probably know this already, but where does G get their blog data from? are they watching the ping sites? are they watching feeds? obviously the post on the site is what is indexed, but what are they monitoring to get updates?

    though I guess my sitemaps submission would get around any issues there. thoughts?

  28. cormski Says:

    great post - much of it is irritatingly obvious when you spend a minute thinking about it - but fantastic pulling it all together in one place.

    Now just need to find the time to do this and the 101 other housekeeping tasks that blogging demands.

    Best

    cormski

  29. Blog Informático » Contenido duplicado en Wordpress Says:

    [...] Reading an article on seorechearser.com titled How to Make a WordPress Blog Duplicate Content Safe, I find more precautions for avoiding duplicate content in Wordpress. [...]

  30. Wo kommen plötzlich die vielen Kategorien her? » Kinderwunsch-News Says:

    [...] I have also noticed that the blog is now listed poorly in Google’s search results, presumably because of “duplicate content”, which is why I have changed a few small things here that one reader or another will notice. The nonsense one sometimes has to deal with… [...]

  31. Chinese SEO Says:

    Although Google says it would identify the duplicated content. I think it is always nice to make it easy for Google bot. As you mentioned, add ‘no-follow’ is a solution to dulplicate contents.

  32. Adam Says:

    This is a great rundown of what needs to be done! Thanks for taking the time and gathering all the data you’ve done and/or what others are doing.

    I’ve started to implement several of the tactics you’ve outlined above. I do wish there was an easier solution, like some sort of plugin for getting rid of the duplicants, based on the type of permalinks you select. Actually, I’m surprised no one has done that yet.

  33. RaymonWazerri Says:

    Hey,
    I love what you’e doing!
    Don’t ever change and best of luck.

    Raymon W.

  34. belajar wordpress Says:

    nice article friend…. I would use tips in this article to my blog site too.

  35. ariff mahmood Says:

    Permalink can solve this duplication problem. Use the post id as a permalink then the search engines will see you less suspicious.

  36. San Diego RE Says:

    Great information. But, why not just use a specific ‘no-follow’ tag for Goggle?

    San Diego RE

  37. Qais Al-Khateeb Says:

    Very interesting post, it clears up many issues about duplicated content in CMS and how to handle it in professional way. Again, thanks for this great post :-)

  38. Get rid of your duplicate content [FitForFreedom] Says:

    [...] Is WordPress is a problem? WordPress is giving you several capabilities of managing the content of your site and can be considered being some kind of content-management-system. But some of these features may cause trouble for you as WordPress could render almost 100% of your content as dublicate content. Oleg Ishenko of seoresearcher.com has written a very detailed article on how to make your wordpress blog duplicate content safe. It is definitely worth reading if you are running WordPress and are new to the duplicate content issue. [...]

  39. Net Writing » How Many Categories to Use in a blog Entry? Says:

    [...] There is a good post here on avoiding duplicate content issues with wordpress; many ideas are useful for other blogging platforms. [...]

  40. » SEO for WordPress Part II Says:

    [...] This is the second part of the essential SEO tips for WordPress blogs covering the topics of Google Sitemaps plugins, pings and ping servers, valid (X)HTML, importance of a layout that puts post content ahead of sidebars and navigation, and displaying post excerpts and teaser text on the home page. You should also check out other articles relevant to the SEO for blogs: How to Make a WordPress Blog Duplicate Content Safe and SEO for WordPress Part 1 [...]

  41. How to Not Land in Google Hell (Part 3 of 3) - Idiot Affilate Says:

    [...] *source-seoresearcher.com*  [...]

  42. Kiran Konathala Says:

    Hello all

    Very informative post. Please help this poor student with a BIG problem. I migrated from blogger to wordpress few days ago and I had 57 very successful,good,unique posts in Blogger and I moved those posts to Wordpress. Now if they reside on both the places - then its duplicate,right? How to avoid this? If I delete my blogger blog, I may take a lotta time to impress search engines!! Is there any way to ‘noindex’ only those 57 posts in my wordpress blog? Can I add a new category for those 57 posts and then noindex them in the header?

    Thanks for reading this far!! Hope my issue is resolved :)

  43. Duplicate Content vermeiten mit robots.txt - Punctilio Says:

    [...] Anyone who wants to play it safe now should look at the guide “How to make a WordPress duplicate content safe”. The robots.txt Generator also provides a basic skeleton to start from. [...]

  44. PHP MySql Programmer / Developer Says:

    I’ve read there is a plugin that takes care of wordpress archives becoming duplicate content.

    I do not recall it’s name however that would make things simpler.

  45. Gerd-E. Says:

    Very helpful, because Google looks for duplicate content and “bad” description and title-tags. Thanks.

  46. New WordPress Blogs: 12 Steps to Set Up for Success << Vandelay Website Design Says:

    [...] Duplicate content can be major challenge for blogs and search engine traffic. Blogs by their nature create duplicate content on category and date archives, as well as on the blog home page. Search engines, especially Google, go to great lengths to keep duplicate content out of the search results, which means less traffic for you if they think your pages are repeating the same content.  This issue is much too large to cover in depth here (for additional reading see the Google Webmaster Central Blog, SEOResearcher.com, and SEOmoz) but there are a few simple things that you can do to lower the risk of having your pages flagged as duplicates.  [...]

  47. Marc Says:

    Nice!

  48. How to avoid duplicate content on Wordpress Says:

    [...] Salutes to the man - indeed a great article! [...]

  49. Kenneth Says:

    Just a short note to say I like your blog.

    Good job and keep up the great work!

    Kenneth

  50. Alban’s blog » Blog Archive » Simple Robots Meta Wordpress plugin Says:

    [...] Using recommendations like How to Make a WordPress Blog Duplicate Content Safe, my objective is adding the “noindex,follow” to every page except posts, home and categories pages. Paged home and categories pages (/page/2, …) should be tagged too. [...]

  51. Get rid of your duplicate content » MarcoRichter.net Says:

    [...] Is WordPress a problem? WordPress is giving you several capabilities of managing the content of your site and can be considered being some kind of content-management-system. But some of these features may cause trouble for you as WordPress could render almost 100% of your content as dublicate content. Oleg Ishenko of seoresearcher.com has written a very detailed article on how to make your wordpress blog duplicate content safe. It is definitely worth reading if you are running WordPress and are new to the duplicate content issue. [...]

  52. jim spencer Says:

    Great examples. How long will you keep the 301 Redirects in place? I would think that after 4 to 8 months there would no longer be a need for them.

  53. Phoenix Criminal Lawyer Says:

    no follow tags, 301 redirects, sure seems like a bunch of work. I think there maybe a WordPress plug in that does all this for you.

  54. azam zaki Says:

    Im having a phobia adding noindex nofollow tag since blogger had the same problem with Google indexing previously.anyway its a good point im gonna try it. good tips man

  55. Wordpress club Says:

    [...] For tips how to get rid of the duplicate content in Wordpress please refer to my tutorial: Making Your Wordpress Blog Duplicate Content Safe [...]

  56. Duplicate content, how to make your blog a safer place for Search Engine spiders. | Bare Fly.com Says:

    [...] I recommend you to read the post How to make your wordpress blog duplicate content safe, written by Oleg Ishenko, the owner of SeoSearcher.com. It’s an amazing read. [...]

  57. heelys Says:

    I’m using semiologic and wordpress as a cms, so I don’t have any issues with duplicate content on categories, archives, etc..

    But what I found is that google is indexing search querys done in my site, so I have duplicate content there..

    So I have urls indexed like http://www.domain.com/?s=query
    Why is that happening, and how do I stop that?? Using robots.txt??

    Great blog and good tips.. thanks

  58. Idetrorce Says:

    very interesting, but I don’t agree with you
    Idetrorce

  59. Maximus Says:

    I would like to see a continuation of the topic

  60. Davey Blog Says:

    I’m getting ready to switch to Wordpress and this is interesting.

  61. Hire PHP developer programmer Says:

    Very informative post. It clears up many issues about duplicated content and how to handle it in professional way. Google looks for duplicate content, description and title tags.

  62. Jan Says:

    Very informative post. I will use some of the stuff for my blog. For your site it seems to work since it was the on top in my Google Search.

    Thanks

  63. Amit Nyamtabad Says:

    Really very nice article on dup content. I really liked the way you have blocked the bot access from the .htaacess file.

  64. ordination Says:

    It’s good information. Still I would rather just write then fool around with all this code stuff. Really some one should figure a way to do this automatically.
    thanks

  65. Do search engines index the same page more than once? | blog.netweblogic.com Says:

    [...] Rather than explain it all, Oleg Ishenko does a very good job showing How to Make a WordPress Blog Duplicate Content Safe with some very practical ideas to improve your situation with regards to search engine ranks. [...]

  66. Archie Says:

    This article is great, I was having some indexing problems myself thanks to my feed getting indexed before my blog, this post is fantastic. Thanks for all the tips.

  67. Markus Says:

    Just what I was looking for right now. These are excellent tips and very easy to understand. Thanks!

  68. Germanische Mythologie Says:

    That is very nice. Thank you very much.

  69. Webagentur Says:

    This article like me very well and was also very helpful. Thank you!

  70. Seo Valencia Says:

    Great article. Thanks!

  71. petnos Says:

    before starting a new project this post is like fresh air for me. thanks.

  72. Billigflug Says:

    I, too, had this article very helpful. Well it some good blogs.

  73. Manish Says:

    Thanks for the great info. I was really wondering how to deal with it and avoiding many unique pages to go to supplemental pages. Well, thanks for the info. I’d also tried category killer plugin, and I think it also worked, but couldn’t tell for certainity what is working as google is indexing my pages so slowly.

  74. Gabfire web design » Wordpress duplicate content problem Says:

    [...] has very detailed and great post about this issue. Worth to check it out. This entry was posted on Wednesday, January 16th, [...]

  75. Web Proxy Says:

    That was really a very interesting read. But I got a few questions to ask.

    What about tag archives? Are they considered as duplicated content too? Should I prevent googlebots from indexing them as well?

  76. UK Insurance News Says:

    Good points even if this post is a little bit old. I’ve been getting used to using Wordpress for a few months only but my question is similar to the commenter above.

    What about tags? I see Google indexing not only my main pages but if I have 10 tags per post then that’s another 10 pages, theoretically, that Google is indexing but, it’s obviously duplicate content barring 1 word, the tag itself.

    Any tips?

    Thanks for the write up :)
