
How to Avoid Duplicate Pages?
One of the penalties that Google can impose on a site is based on the presence of duplicate pages.
What should you do if a website has duplicate pages, that is, pages that differ only in their URL but show the same content?
If you search Google for your own websites, or for other websites, you can find different pages with the same content. The only difference is the URL, for example:
yoursite.com
www.yoursite.com
yoursite.com/index.htm
or wiki pages…
wikiname.com/product1
wikiname.com/article/product1
wikiname.com/category/product1
We must avoid this situation because Googlebot spends its time checking these duplicate pages instead of the new pages or new content on our website.
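As a rough illustration of how such URL variants can be spotted, here is a minimal sketch that reduces each URL to a comparison key and groups the ones that collide. The helper names `normalizeUrl` and `groupDuplicates` are hypothetical, and the rules (drop `www.`, treat `index.*` as the directory) are assumptions that fit the home page examples above. It only compares URLs, not page content, so it would not catch the wiki case.

```javascript
// Hypothetical helper: reduce common home-page variants to one comparison key.
function normalizeUrl(raw) {
  // Accept bare hosts such as "yoursite.com" by assuming http://.
  const url = new URL(/^https?:\/\//i.test(raw) ? raw : "http://" + raw);
  // Assumption: the www and non-www hosts serve the same site.
  const host = url.hostname.toLowerCase().replace(/^www\./, "");
  // Assumption: /index.htm, /index.html, /index.php and /index.asp
  // show the same content as the directory itself.
  const path = url.pathname.replace(/\/index\.(?:html?|php|asp)$/i, "/");
  return host + path;
}

// Group URLs by key and keep only the keys reached by more than one URL.
function groupDuplicates(urls) {
  const groups = new Map();
  for (const u of urls) {
    const key = normalizeUrl(u);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(u);
  }
  return [...groups.entries()].filter(([, list]) => list.length > 1);
}

// Usage: the three home-page variants end up in one group.
console.log(groupDuplicates([
  "yoursite.com",
  "www.yoursite.com",
  "yoursite.com/index.htm",
  "yoursite.com/about.html",
]));
```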
Here is a list of tips on how to avoid this problem and so improve how Google indexes the pages of a site.
Use Google Webmaster Tools
If you do not already know them, these are tools provided by Google to help webmasters get their websites indexed.
URL > http://www.google.com/webmasters/tools/
After logging in, you can select your website from the list and manage it.
In the side menu there is the Diagnostics submenu, which gives you suggestions and warnings about your website, for example duplicate meta descriptions and duplicate title tags. The diagnostic tool lists the URLs with the duplicate tags, so you can check them and fix the problem.
Delete the Duplicate Pages from the XML Sitemap
Check your Sitemap: if it contains the URL of a duplicate page, delete it and make sure that only the original version of the page remains in the XML Sitemap!
Do not do this
<url>
  <loc>http://www.yourname.it/</loc>
  <priority>1.00</priority>
  <lastmod>2009-12-27T14:22:55+00:00</lastmod>
  <changefreq>daily</changefreq>
</url>
<url>
  <loc>http://www.yourname.it/index.html</loc>
  <priority>0.80</priority>
  <lastmod>2009-03-11T14:22:55+00:00</lastmod>
  <changefreq>daily</changefreq>
</url>
...
and do this
<url>
  <loc>http://www.yourname.it/</loc>
  <priority>1.00</priority>
  <lastmod>2009-12-27T14:22:55+00:00</lastmod>
  <changefreq>daily</changefreq>
</url>
<url>
  <loc>http://www.yourname.it/category1.html</loc>
  <priority>0.80</priority>
  <lastmod>2009-03-11T14:22:55+00:00</lastmod>
  <changefreq>daily</changefreq>
</url>
...
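If the Sitemap is long, a small script can help with this check. The following is only a sketch: `findIndexDuplicates` is a hypothetical helper that pulls the `<loc>` values out of the Sitemap text with a regular expression and reports every index file whose directory URL is also listed. It assumes the index file and the directory show the same page.

```javascript
// Hypothetical helper: report <loc> entries that point at an index file
// (index.htm, index.html, index.php, index.asp) when the directory URL
// itself is also listed in the same Sitemap.
function findIndexDuplicates(sitemapXml) {
  const locs = [...sitemapXml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)]
    .map((match) => match[1]);
  const listed = new Set(locs);
  return locs.filter((loc) => {
    const dir = loc.replace(/\/index\.(?:html?|php|asp)$/i, "/");
    return dir !== loc && listed.has(dir);
  });
}

// Usage with the two entries from the "do not do this" example.
const badSitemap = `
<url><loc>http://www.yourname.it/</loc></url>
<url><loc>http://www.yourname.it/index.html</loc></url>`;
console.log(findIndexDuplicates(badSitemap));
```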
The Error of the Index Page
A web page usually has a main menu with a link to the home page, a logo and an H1 heading element. All of these link to the home page; now remember that the home page can be reached through different URLs:
yoursite.com
www.yoursite.com
yoursite.com/index.htm
The tip is, do not do this:
<a href="index.html" title="your title" name="your-title">
but do this:
<a href="/" title="your title" name="your-title">
…and also in this case, remove index.html (or index.php, index.asp, etc.) from your XML Sitemap and list only www.yoursite.com
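To apply this tip to existing templates, the home links can be rewritten in bulk. This is a minimal sketch with a hypothetical helper, `rewriteHomeLinks`. It assumes the HTML belongs to pages in the site root, because a relative `index.html` link inside a subdirectory points at that subdirectory's index, not at the home page.

```javascript
// Hypothetical helper: point links to the root index file at "/" instead.
// Assumption: the HTML comes from pages that live in the site root.
function rewriteHomeLinks(html) {
  return html.replace(
    /href=(["'])\/?index\.(?:html?|php|asp)\1/gi,
    "href=$1/$1"
  );
}

// Usage: the home link changes, other links are left alone.
console.log(rewriteHomeLinks(
  '<a href="index.html" title="your title">Home</a> ' +
  '<a href="/category1.html">Category</a>'
));
```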
HTTP 301 Status Code
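If you have one or more pages with the same content, you can redirect them to the original page using the 301 status code, a simple and useful redirect.
The message that Googlebot receives is “Moved Permanently”, so it will visit only the original page.
You can redirect in different ways. Note that only a server-side redirect, such as the PHP one, actually sends the 301 status code; META refresh and JavaScript redirects run in the browser and are best kept as fallbacks for when you cannot configure the server.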
Redirection with META Refresh (the easiest way!)
<META HTTP-EQUIV="REFRESH" CONTENT="0; URL=http://www.website-name.com/original-page.html">
Redirection with Javascript
<html>
<head>
<script type="text/javascript">
window.location.href='http://www.website-name.com/';
</script>
</head>
<body>
This page has moved to <a href="http://www.website-name.com/">http://www.website-name.com/</a>
</body>
</html>
HTTP 301 Redirect in PHP
<?php
// Permanent redirection
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.website-name.com/");
exit();
?>
For the complete list of permanent redirect methods (Perl, ColdFusion, ASP, etc.), see: Permanent Redirect with HTTP 301
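For sites served by Node.js, the same server-side idea might look like the sketch below. The `REDIRECTS` map and the `handleRequest` name are hypothetical; the handler has the `(req, res)` shape that Node's `http.createServer` expects and answers with a 301 status and a `Location` header for every path in the map.

```javascript
// Hypothetical map of duplicate paths to their original page.
const REDIRECTS = new Map([
  ["/index.html", "/"],
  ["/index.htm", "/"],
]);

// Request handler in the (req, res) shape used by Node's http.createServer.
function handleRequest(req, res) {
  // Ignore the query string when looking up the path.
  const path = req.url.split("?")[0];
  const target = REDIRECTS.get(path);
  if (target !== undefined) {
    // 301 tells the client the page has moved permanently.
    res.writeHead(301, { Location: target });
    res.end();
    return;
  }
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("original page");
}
```

To try it, pass the handler to `require("http").createServer(handleRequest)` and call `.listen()` on a port of your choice.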
…but the 301 redirect is not the only option!
The Canonical Page
If you have duplicate pages, you can add this code to the head section of each duplicate:
<link rel="canonical" href="http://www.website-name.com/original-page.htm"/>
You can use it with relative or absolute links. The canonical link is a suggestion for Googlebot, not a directive.
Remember: use it only if you cannot delete the duplicate pages or their content. The content of the pages must be identical!
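To check that every duplicate really declares the same original page, the canonical link can be read back from the HTML. The sketch below uses a hypothetical helper, `getCanonical`, based on regular expressions; it is good enough for a quick audit of simple markup like the tag above, not a full HTML parser.

```javascript
// Hypothetical helper: return the href of the first rel="canonical" link
// found in the HTML, or null if there is none.
function getCanonical(html) {
  const tag = html.match(/<link\b[^>]*\brel=["']canonical["'][^>]*>/i);
  if (!tag) return null;
  const href = tag[0].match(/\bhref=["']([^"']+)["']/i);
  return href ? href[1] : null;
}

// Usage: two duplicate pages should report the same canonical URL.
const pageA = '<head><link rel="canonical" href="http://www.website-name.com/original-page.htm"/></head>';
const pageB = '<head><link href="http://www.website-name.com/original-page.htm" rel="canonical"></head>';
console.log(getCanonical(pageA) === getCanonical(pageB));
```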
Duplicate Pages and Robots.txt
Another simple tip to solve the duplicate pages problem! In robots.txt you can list the directories and files that Googlebot should not crawl.
How do you do this? It’s simple!
User-Agent: *
Disallow: /directory/subdirectory/
Disallow: /directory/file.html
Allow: /
This way Googlebot does NOT crawl /directory/subdirectory/ and /directory/file.html but crawls everything else. With Google Webmaster Tools you can automatically generate your robots.txt in a few clicks.
For more information about robots.txt, visit the official website: http://www.robotstxt.org/
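To see why the `Allow: /` line does not cancel the two `Disallow` lines, here is a simplified sketch of how such rules are matched. The `parseRules` and `isAllowed` helpers are hypothetical. They only read the `User-Agent: *` group, ignore wildcards, and assume the longest matching path wins, with `Allow` winning a tie, which is how Google documents its own matching.

```javascript
// Hypothetical, simplified parser: collects Allow/Disallow rules from the
// "User-Agent: *" group only and ignores wildcards such as * and $.
function parseRules(robotsTxt) {
  const rules = [];
  let applies = false;
  for (const line of robotsTxt.split(/\r?\n/)) {
    const m = line.replace(/#.*/, "").trim().match(/^([A-Za-z-]+)\s*:\s*(.*)$/);
    if (!m) continue;
    const field = m[1].toLowerCase();
    const value = m[2].trim();
    if (field === "user-agent") {
      applies = value === "*";
    } else if (applies && (field === "allow" || field === "disallow") && value !== "") {
      rules.push({ allow: field === "allow", path: value });
    }
  }
  return rules;
}

// Assumption: the longest matching path prefix wins; on a tie, Allow wins.
function isAllowed(rules, path) {
  let best = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.path)) continue;
    if (
      best === null ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.allow)
    ) {
      best = rule;
    }
  }
  // With no matching rule, crawling is allowed.
  return best === null ? true : best.allow;
}

// Usage with the rules from the example above.
const exampleRules = parseRules(
  "User-Agent: *\n" +
  "Disallow: /directory/subdirectory/\n" +
  "Disallow: /directory/file.html\n" +
  "Allow: /\n"
);
console.log(isAllowed(exampleRules, "/directory/subdirectory/page.html"));
console.log(isAllowed(exampleRules, "/directory/other.html"));
```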
Comments/Suggestions are welcome!











This article really helped me. Thank you! This is an awesome resource for identifying and remedying problems and pitfalls we all have with our web properties, from time to time. Even if someone were to have a doctorate in Internet marketing, if one were available, it would be impossible to keep up with all of the changes and the technological advances…not to mention the constant algorithmic changes Google dreams up and implements. Well done!
Professor John P. J. Zajaros, Sr.
The Internet Marketing Quest Revealed
The Ultimate Internet Image
Bookmarked your article, thanks! regards, pp
Superb blog here. I have been using Akismet for my WordPress blog and wonder whether it does a good job of blocking spam, as a lot still gets through? A reply would be helpful, mate.
I found your blog on google and read a few of your other posts. I just added you to my Google News Reader. Keep up the good work. Look forward to reading more from you in the future.
Optimization is a key aspect of getting your site noticed. Thanks for the info.
If there is duplicated content, before Google applies a penalty it will first look at the age of the site. That way, the search engine can tell who posted first…
Search engines constantly work towards refining their technology to crawl the web more severely and return progressively more relevant results to users. The higher a website ranks in the search engines, the more users will visit it, which means you will get high traffic. In other words you can say that Search Engine ranking.
Great tips brother, thanks for sharing. It's very useful.