Remove Duplicate Content from Google's Index - Watch your Rankings Increase!

by strategic seo services Posted: 01/09/2013
You can improve your website's ranking in Google by eliminating your site's duplicate content from the Google index.

What counts as duplicate content and how do you know if it's showing in the index?
Well, let's say that you have a blog. You have "Post A", which is totally unique content. However, that "Post A" is also showing up on a "category" page of your site. That category page is placed in Google's supplemental index and is considered to be duplicate content by Google.

How do you check for this?

Do a search like this in Google: site:http://www.yoursite.com

Now go to the last page of results and check whether you see a message like this:
"In order to show you the most relevant results, we have omitted some entries very similar to the 42 already displayed.
If you like, you can repeat the search with the omitted results included."
  1. Click on "repeat the search with the omitted results included."
  2. Now, find the pages that are repeats (e.g. the category page containing "Post A"). Get the URLs of those pages and submit content removal requests to Google through Webmaster Tools: Completely remove an entire page - Webmaster Tools Help
  3. Now, the last step: add a Disallow rule to your robots.txt file so that the URL is no longer crawled or indexed by Google.

URLs are usually removed from Google's index within 24 hours.
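
To double-check step 3 before relying on it, here is a minimal Python sketch (an illustration only, not part of the procedure above) that uses the standard library's robots.txt parser to confirm a URL is blocked. The /category/ path and the sample URLs are assumptions; substitute the URLs you actually found.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /category/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow crawling of url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/category/guitars/",
        "http://www.yoursite.com/post-a/",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

This only tells you what the rules say to a crawler that obeys them; it does not tell you what is currently in Google's index.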

Use this strategy & witness a great improvement in your site's rankings! This strategy has helped me a great deal. Just thought I'd share it with you.

  • gearmonkey
    The problem I am having with one of my guitar blogs is that people are stealing the content as fast as I put it up. I read an article over at SEOmoz saying this is an ongoing problem that Google needs to address ASAP, because a lot of people are affected by it in a negative way. Leecher splogs are outranking the original source in some cases.

    This is a good read: Post-Panda, Your Original Content is Being Outranked by Scrapers & Partners | SEOmoz

    Also read the comments.

    Great advice, btw. If you are using WordPress, I recommend the Robots Meta plugin by Yoast.
  • strategic seo services
    Thanks. I use Robots Meta. It works really well. Highly recommended.
  • yukon
    Long term it's easier to create a better category.php template page inside your WP theme that doesn't add a full WP Post to the Category page. Instead of adding WP Post on the Category page, create a unique static description (per Category) then simply link out to the internal WP Post in the same category.

    [related thread]
    Any Good Silo Structure Plugins You Know Of?

    No sense in constantly chasing down pages you don't want in the SERPs. There's also no reason to not rank a Category page, it's the mother of all pages related to the subject/category on your site.
  • gearmonkey
    Interesting point.

    Do some people find that some WordPress themes rank better than others? For example, Genesis vs. heatmapthemes, both paid and premium themes.
  • MikeFriedman
    So nobody has ever heard of a canonical tag around here?
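
    For anyone who hasn't: a canonical tag is a <link rel="canonical" href="..."> element in a page's <head> that tells search engines which URL is the preferred version of that content. Below is a minimal Python sketch, standard library only, for checking which canonical URL a page declares. The sample HTML and URLs are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split before matching.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate URL (e.g. /post-a/?replytocom=12) pointing at the original.
SAMPLE_HTML = """
<html><head>
<title>Post A</title>
<link rel="canonical" href="http://www.yoursite.com/post-a/">
</head><body>Post A content</body></html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_HTML))
```

    A page without the tag returns None, which is a quick way to spot templates that are missing it.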
  • yukon
    Originally Posted by gearmonkey View Post
    The problem I am having with one of my guitar blogs is that people are stealing the content as fast as I put it up. I read an article over at SEOmoz saying this is an ongoing problem that Google needs to address ASAP, because a lot of people are affected by it in a negative way. Leecher splogs are outranking the original source in some cases.

    This is a good read: Post-Panda, Your Original Content is Being Outranked by Scrapers & Partners | SEOmoz

    Also read the comments.

    Great advice, btw. If you are using WordPress, I recommend the Robots Meta plugin by Yoast.

    No offense to you, gearmonkey, but that's exactly the kind of crap Moz spews out that is totally ridiculous.

    That article is dated April 20th, 2011, and Moz claims Panda is the reason scraped pages outrank the original source pages.

    Scraped pages have been outranking original source pages since the beginning of Google's existence. This has nothing to do with Panda and definitely started years before Panda or Moz ever existed.

    There's no possible way Google's algo can determine which site created the original content; it's literally impossible. Google can't compare dates on who had the first crawl/index/cache, because you might have a small site that creates awesome original content but doesn't know or care much about SEO, alongside SEO-savvy scraper sites that know how to get their pages indexed ASAP (no real SERP competition).

    The original source of content means nothing to Google's algo; again, it's literally impossible to figure out who created the original content. The only choice, if anyone wants to rank pages, is to have better SEO than the scraper sites (not optional).
  • ucables
    Julia, thank you for this very interesting info!

    Do you know of any tool to check for repeated pages automatically?

    When you have thousands of pages indexed, it's not easy to check them all for duplicates.
  • strategic seo services
    You're welcome. Unfortunately I don't know of a tool that checks for repeated pages automatically. If I find one, I'll post it here.
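
    That said, a rough version of such a check is not hard to sketch. The Python below is only an illustration under stated assumptions: it expects you to have already collected the body text of each URL yourself (the PAGES dictionary is made-up sample data), and it only catches pages whose text is identical after lowercasing and collapsing whitespace. Near-duplicates, such as a category page that wraps a post in extra markup, would need a fuzzier comparison.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing runs of whitespace."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs whose body text matches after normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical sample data: URL -> extracted body text.
PAGES = {
    "http://www.yoursite.com/post-a/": "Post A:  totally unique content.",
    "http://www.yoursite.com/category/guitars/": "post a: totally unique\ncontent.",
    "http://www.yoursite.com/post-b/": "Post B: something else entirely.",
}

if __name__ == "__main__":
    for group in find_duplicates(PAGES):
        print("Duplicate group:", group)
```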
  • GGpaul
    Is there an easier way to do this? This makes it difficult for me when I get 8,000+ results.
  • strategic seo services
    I found this article that explains a process that seems a bit easier if you have a lot of pages on your site: How to Find Duplicate Content | SEOMention

    It gives a lot of helpful tips too.
  • gearmonkey
    Originally Posted by ucables View Post
    Julia, thank you for this very interesting info!

    Do you know of any tool to check for repeated pages automatically?

    When you have thousands of pages indexed, it's not easy to check them all for duplicates.
    If you're using WordPress, you can use this plugin
    WordPress › Duplicate Post « WordPress Plugins
  • satrap
    This plugin doesn't help you find duplicate posts. It helps you create Clones of your already published posts!...
  • gearmonkey
    I used one before that would mass find and mass delete duplicate posts. I just did a quick search and that popped up. Plugins are out there for those motivated enough to look.
  • strategic seo services
    Has anyone come across the correct plugin yet?
  • Rehmat
    Originally Posted by yukon View Post
    Long term it's easier to create a better category.php template page inside your WP theme that doesn't add a full WP Post to the Category page. Instead of adding WP Post on the Category page, create a unique static description (per Category) then simply link out to the internal WP Post in the same category.
    Isn't it a better deal to prevent indexing of categories in WordPress using WordPress SEO or any other plugin? Less-experienced webmasters can't easily do what you have suggested.
  • strategic seo services
    Originally Posted by Rehmat View Post
    Isn't it a better deal to prevent indexing of categories in WordPress using WordPress SEO or any other plugin?
    Yes, preventing duplicate content from reaching the Google index is a good first step. However, many sites already have duplicate content in the index, which keeps their rankings from reaching their full potential. So a plugin or program that detects duplicate content already in the index would be a big help.
  • paulgl
    Originally Posted by MikeFriedman View Post
    So nobody has ever heard of a canonical tag around here?
    That would require working and learning.

    Nobody has heard of much, when you think about it. How about
    ditching WP and getting either a real CMS, your own coding, or
    even blogspot?

    In reality, duplicate content that causes any such penalties is
    rare. Very rare. Threads like this just make people think the sky is
    falling. This is 2013, people. 2013. Are you still next to
    Rip Van Winkle?

    I dare anyone to see how much "duplicate content" the wf,
    amazon, wikipedia, NYT, etc. have.

    Paul
  • strategic seo services
    Originally Posted by paulgl View Post

    In reality, duplicate content that causes any such penalties is
    rare. Very rare. Threads like this just make people think the sky is
    falling. This is 2013, people. 2013. Are you still next to
    Rip Van Winkle?

    I dare anyone to see how much "duplicate content" the wf,
    amazon, wikipedia, NYT, etc. have.

    Paul
    Really? It's funny that you cited the New York Times (NYT) in your "dare", as Matt Cutts has a video that addresses this very issue: Matt Cutts Addresses Duplicate Content Issue In New Video | WebProNews

    You cannot compare authority sites like the NYT or Wikipedia to an "average" site on the web. Yes, we are in 2013, and the above article was written only 7 months ago.

    My experience has shown me that by removing duplicate content from Google's index, a website's rankings will, in turn, increase within a 24-hour period.
  • brettb
    I'm pretty sure that duplicate content sank my older sites. In one example I found one of my articles was stolen from my site but the stolen article had an older timestamp in Google!

    Matt Cutts has no answer for this.
  • AravGupta
    A better way to check for duplicate on-page elements is to add your website to Webmaster Tools, open the HTML Improvements section under Optimization, download the list of errors, fix them manually, and then resubmit your sitemap. I did this recently and have benefited a lot.
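
    If you want a rough local equivalent of one check in that report, duplicate title tags, here is a minimal Python sketch using only the standard library. It assumes you have already downloaded the HTML of each page yourself; the PAGES dictionary and its URLs are made-up sample data, and this is not how Webmaster Tools itself works internally.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, List


class TitleParser(HTMLParser):
    """Collects the text found inside the page's <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def duplicate_titles(pages: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each title used by more than one URL to the sorted list of URLs."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        parser = TitleParser()
        parser.feed(html)
        by_title[parser.title.strip()].append(url)
    return {title: sorted(urls) for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical sample data: URL -> raw HTML.
PAGES = {
    "http://www.yoursite.com/post-a/": "<html><head><title>Post A</title></head></html>",
    "http://www.yoursite.com/category/guitars/": "<html><head><title>Post A</title></head></html>",
    "http://www.yoursite.com/about/": "<html><head><title>About</title></head></html>",
}

if __name__ == "__main__":
    for title, urls in duplicate_titles(PAGES).items():
        print(repr(title), "is used by", urls)
```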
