Duplicate Content

Duplicate content—canonical tags and other fun.

I will describe how Google Panda penalized sites with duplicate content. Unfortunately, many content management systems automatically create multiple versions of the same page.

For example, let’s say your site has a product page on socket wrenches, but because of the system your site is built on, the exact same page can be accessed from multiple URLs from different areas of your site:

http://www.yoursite.com/products.aspx?=23213

http://www.yoursite.com/socket-wrenches

http://www.yoursite.com/tool-kits/socket-wrenches

In the search engine’s eyes, this is confusing as hell, and the multiple versions of the page are considered duplicate content.

To account for this, you should always ensure a special tag is placed on every page in your site, called the ‘rel canonical’ tag.

The rel canonical tag indicates the original version of a web page to search engines. By putting the URL of the page you consider to be the ‘true’ version into the tag, you tell Google which page you want listed in the search results.

Choose the URL that makes the most sense to users and offers the best SEO benefit. This should usually be the URL that reads like plain English.

Using the earlier socket wrenches example, with the tag below, Google would be more likely to display the best version of the page in the search engine results.

<link rel="canonical" href="http://www.yoursite.com/socket-wrenches"/>

As a general rule, include this tag on every page on your site, shortly before the </head> tag in the code.
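
To make the placement concrete, here is a rough sketch of a bare-bones page with the tag in position. The page title and the body placeholder are made up for illustration; only the canonical tag itself comes from the socket wrenches example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Placeholder title, for illustration only -->
  <title>Socket Wrenches</title>
  <!-- Tells search engines which URL is the preferred version of this page -->
  <link rel="canonical" href="http://www.yoursite.com/socket-wrenches"/>
</head>
<body>
  <!-- Page content goes here -->
</body>
</html>
```

The same tag, pointing at the same URL, would go in the head of every version of the page, so all three of the socket wrench URLs above would name http://www.yoursite.com/socket-wrenches as the preferred version.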
