SEO Tips – How to Avoid Duplicate Pages?
One of the penalties that Google can apply to a site is based on the presence of duplicate pages.
What should you do if a website has duplicate pages, i.e. pages that differ only in their URL but have the same content?
If you search for your website (or any other website) on Google, you can find different pages with the same content. The only difference is the URL, for example:
yoursite.com
www.yoursite.com
yoursite.com/index.htm
or wiki pages...
We should avoid this situation, because Googlebot spends its crawl time checking these duplicate pages instead of the new pages and new content on our website.
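The three addresses above can be collapsed into a single form. Here is a minimal Python sketch (standard library only) that normalizes the homepage variants; the choice of "http://www." plus a trailing slash as the canonical form is an assumption for illustration:

```python
# Sketch: collapse common duplicate variants of a homepage URL into one
# canonical form (assumed here to be "http://www.<host>/").
from urllib.parse import urlparse

def normalize(url):
    """Normalize a homepage URL so duplicate variants compare equal."""
    if "://" not in url:
        url = "http://" + url          # add a scheme so urlparse finds the host
    parts = urlparse(url)
    host = parts.netloc
    if not host.startswith("www."):
        host = "www." + host           # unify the bare and www. hostnames
    path = parts.path
    if path in ("/index.htm", "/index.html", "/index.php"):
        path = "/"                     # the index file is the same page as "/"
    return "http://" + host + (path or "/")

variants = ["yoursite.com", "www.yoursite.com", "yoursite.com/index.htm"]
print({normalize(v) for v in variants})  # all three collapse to one URL
```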
Here is a list of tips on how to avoid this problem and thus improve the indexing of a site's pages on Google.
Use Google WebMaster Tools
If you do not already know them, these are tools provided by Google to help webmasters with the indexing of their websites.
After the login you can select your website from the list and manage it.
In the side menu there is the Diagnostics submenu, which gives you suggestions and warnings about your website, for example Duplicate Meta Descriptions and Duplicate Title Tags. The Diagnostics tool lists the URLs with the duplicate tags, so you can check them and fix the problem.
Delete the Duplicate Pages from the XML Sitemap
Check your sitemap: if it contains the URL of a duplicate page, delete it and make sure that only the original version of the page remains in the XML sitemap!
Do not do this
<url>
   <loc>http://www.yourname.it/</loc>
   <priority>1.00</priority>
   <lastmod>2009-12-27T14:22:55+00:00</lastmod>
   <changefreq>daily</changefreq>
</url>
<url>
   <loc>http://www.yourname.it/index.html</loc>
   <priority>0.80</priority>
   <lastmod>2009-03-11T14:22:55+00:00</lastmod>
   <changefreq>daily</changefreq>
</url>
...
and do this
<url>
   <loc>http://www.yourname.it/</loc>
   <priority>1.00</priority>
   <lastmod>2009-12-27T14:22:55+00:00</lastmod>
   <changefreq>daily</changefreq>
</url>
<url>
   <loc>http://www.yourname.it/category1.html</loc>
   <priority>0.80</priority>
   <lastmod>2009-03-11T14:22:55+00:00</lastmod>
   <changefreq>daily</changefreq>
</url>
...
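As a sketch of how this cleanup could be automated, the following Python snippet (standard library only) removes index-file aliases from a sitemap when the corresponding "/" URL is already listed. The list of index-file suffixes is an assumption; adjust it to your site:

```python
# Sketch: drop <url> entries whose <loc> is just an index-file alias
# (e.g. /index.html) of a URL already present in the XML sitemap.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def deduplicate_sitemap(xml_text,
                        suffixes=("/index.html", "/index.htm", "/index.php")):
    """Return the sitemap XML with duplicate index-file URLs removed."""
    root = ET.fromstring(xml_text)
    locs = {u.find(f"{{{NS}}}loc").text for u in root.findall(f"{{{NS}}}url")}
    for url in list(root.findall(f"{{{NS}}}url")):
        loc = url.find(f"{{{NS}}}loc").text
        for suffix in suffixes:
            # keep only the "/" version when both forms are in the sitemap
            if loc.endswith(suffix) and loc[: -len(suffix)] + "/" in locs:
                root.remove(url)
                break
    return ET.tostring(root, encoding="unicode")
```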
The Error of the Index Page
In a web page we have the main menu with a link to the home page, and the logo and the H1 heading element also link to the home page. Now remember that the home page can be reached in different ways.
The tip is: do not do this
<a href="index.html" title="your title" name="your-title">
but do this
<a href="/" title="your title" name="your-title">
...and, also in this case, remove index.html (index.php, index.asp, etc.) from your XML sitemap and insert only www.yoursite.com.
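To find internal links that still point at an index file, a small scan with the standard-library HTML parser can help. This is a minimal sketch; the list of index-file names is an assumption:

```python
# Sketch: flag <a href="..."> values that point at an index file
# (they should link to "/" instead).
from html.parser import HTMLParser

class IndexLinkFinder(HTMLParser):
    """Collect href values like "index.html" that should be "/"."""
    INDEX_NAMES = ("index.html", "index.htm", "index.php", "index.asp")

    def __init__(self):
        super().__init__()
        self.bad_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href", "")
        # compare only the last path segment against the index-file names
        if href.rstrip("/").split("/")[-1] in self.INDEX_NAMES:
            self.bad_links.append(href)

finder = IndexLinkFinder()
finder.feed('<a href="index.html" title="home">Home</a>'
            '<a href="/" title="home">Home</a>')
print(finder.bad_links)  # only the index.html link is flagged
```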
HTTP 301 Status Code
If you have one or more pages with the same content, you can redirect them to the original page using the 301 status code, a simple and useful redirect.
The message that Googlebot receives is "Moved Permanently", so it will visit only the original page.
You can use it in different ways:
Redirection with META refresh (the easiest way, but note that it does not return a real 301 status code):
<META HTTP-EQUIV="REFRESH" CONTENT="0; URL=http://www.website-name.com/original-page.html">
Redirection with JavaScript:
<html>
<head>
<script type="text/javascript">
window.location = 'http://www.website-name.com/';
</script>
</head>
<body>
This page has moved to <a href="http://www.website-name.com/">http://www.website-name.com/</a>
</body>
</html>
HTTP 301 redirect in PHP:
<?php
// Permanent redirection
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.website-name.com/");
exit();
?>
For the complete list of permanent-redirect methods (Perl, ColdFusion, ASP, etc.), see Permanent Redirect with HTTP 301.
...but the 301 redirect is not the only possible redirect!
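To confirm that a duplicate URL really answers with a 301, you can make a single request without following redirects. A minimal Python sketch, using only the standard library (the URL shown is a placeholder, not a real site):

```python
# Sketch: fetch a URL once, without following redirects, and report the
# status code and Location header (http.client does not auto-follow 3xx).
import http.client
from urllib.parse import urlparse

def check_permanent_redirect(url):
    """Return (status, location) for a single HEAD request to `url`."""
    parts = urlparse(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    conn.request("HEAD", parts.path or "/")
    resp = conn.getresponse()
    location = resp.getheader("Location")
    conn.close()
    return resp.status, location

# Usage (hypothetical duplicate page): expect 301 and the original URL.
# status, location = check_permanent_redirect("http://www.website-name.com/index.html")
```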
The Canonical Page
If you have duplicate pages you can choose to add this code in the head section:
<link rel="canonical" href="http://www.website-name.com/original-page.htm"/>
You can use it with relative or absolute links. The canonical link is a suggestion for Googlebot, not a directive.
Remember: use it only if you cannot delete the duplicate pages or their content. The content of the pages must be identical!
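To verify that your duplicate pages all declare the same original, you can extract the canonical URL from each page's head. A minimal sketch with the standard-library HTML parser:

```python
# Sketch: read the rel="canonical" URL from a page's <head>, so you can
# check that every duplicate points at the same original page.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Store the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

finder = CanonicalFinder()
finder.feed('<head><link rel="canonical" '
            'href="http://www.website-name.com/original-page.htm"/></head>')
print(finder.canonical)  # the declared original page
```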
Duplicate Pages and Robots.txt
Another simple tip to solve the problem of duplicate pages! In robots.txt you can specify the directories that Googlebot must not crawl.
How to do this? It's simple!
User-Agent: *
Disallow: /directory/subdirectory/
Disallow: /directory/file.html
Allow: /
In this way Googlebot does NOT crawl /directory/subdirectory/ or /directory/file.html but crawls the others. With Google Webmaster Tools you can generate your robots.txt automatically in a few clicks.
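You can test rules like these before publishing them, using Python's standard-library robots.txt parser. A small sketch, feeding the rules in directly (the paths are the example paths from above):

```python
# Sketch: check which URLs a crawler may fetch under the robots.txt rules
# shown above, using the standard-library parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-Agent: *
Disallow: /directory/subdirectory/
Disallow: /directory/file.html
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "/directory/file.html"))        # disallowed
print(rp.can_fetch("Googlebot", "/directory/other-page.html"))  # allowed
```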
For more information about robots.txt, visit the official website: http://www.robotstxt.org/
Comments/Suggestions are welcome!
Follow us on Twitter for Extra-News and Resources!