Remove Duplicate Content from Google's Index - Watch your Rankings Increase!

37 replies
You can improve your website's ranking in Google by eliminating your site's duplicate content from the Google index.

What counts as duplicate content and how do you know if it's showing in the index?
Well, let's say that you have a blog. You have "Post A", which is totally unique content. However, that "Post A" is also showing up on a "category" page of your site. That category page is placed in Google's supplemental index and is considered to be duplicate content by Google.

How do you check for this?

Do a search like this in Google: site:http://www.yoursite.com

Now go to the last page of results and check whether you see a message like this:
"In order to show you the most relevant results, we have omitted some entries very similar to the 42 already displayed.
If you like, you can repeat the search with the omitted results included."
  1. Click on "repeat the search with the omitted results included."
  2. Now, find the pages that are repeats (e.g. the category page containing "Post A"). Get the URLs of those pages and submit content removal requests to Google through Webmaster Tools: Completely remove an entire page - Webmaster Tools Help
  3. Now, the last step: add a Disallow rule for that URL to your robots.txt file so that Google no longer crawls it.
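
Before relying on the rule in step 3, it can help to confirm that it blocks the URLs you intend and nothing else. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `/category/` path, the example URLs and the `is_crawlable` helper are assumptions made for illustration, not something taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the /category/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/category/guitars/",
        "http://www.yoursite.com/post-a/",
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Note that a Disallow rule only stops crawling. It does not by itself remove a URL that is already in the index, which is why the removal request in step 2 comes first.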

URLs are usually removed from Google's index within 24 hours.

Use this strategy & witness a great improvement in your site's rankings! This strategy has helped me a great deal. Just thought I'd share it with you.
  • Profile picture of the author gearmonkey
    The problem I am having with one of my guitar blogs is that people are stealing the content as fast as I put it up. I read an article over at SEOmoz saying this is an ongoing problem that Google needs to address ASAP, because a lot of people are being hurt by it. Leecher splogs are outranking the original source in some cases.

    This is a good read: Post-Panda, Your Original Content is Being Outranked by Scrapers & Partners | SEOmoz

    Also read the comments.

    Great advice, btw. If you are using Wordpress, I recommend using Robots Meta by Yoast plugin.
    • Originally Posted by gearmonkey View Post

      The problem I am having with one of my guitar blogs is that people are stealing the content as fast as I put it up. I read an article over at SEOmoz saying this is an ongoing problem that Google needs to address ASAP, because a lot of people are being hurt by it. Leecher splogs are outranking the original source in some cases.

      This is a good read: Post-Panda, Your Original Content is Being Outranked by Scrapers & Partners | SEOmoz

      Also read the comments.

      Great advice, btw. If you are using Wordpress, I recommend using Robots Meta by Yoast plugin.
      Thanks. I use Robots Meta. It works really well. Highly recommended.
    • Profile picture of the author yukon
      Banned
      Originally Posted by gearmonkey View Post

      The problem I am having with one of my guitar blogs is that people are stealing the content as fast as I put it up. I read an article over at SEOmoz saying this is an ongoing problem that Google needs to address ASAP, because a lot of people are being hurt by it. Leecher splogs are outranking the original source in some cases.

      This is a good read: Post-Panda, Your Original Content is Being Outranked by Scrapers & Partners | SEOmoz

      Also read the comments.

      Great advice, btw. If you are using Wordpress, I recommend using Robots Meta by Yoast plugin.

      No offense to you gearmonkey, but that's exactly the kind of crap that Moz spews out that is totally ridiculous.

      That article is dated April 20th, 2011, & Moz claims Panda is the reason scraped pages outrank the original source pages.

      Scraped pages have been outranking original source pages since the beginning of Google's existence. This has nothing to do with Panda & definitely started years before Panda or Moz ever existed.

      There's no possible way Google's algo can determine which site created the original content; it's literally impossible. Google can't compare dates on who had the first crawl/index/cache, because you might have a small site that doesn't care or know much about SEO creating awesome original content, and then SEO-savvy scraper sites that know a lot about SEO & how to get their pages indexed ASAP (no real SERP competition).

      Original sources of content mean nothing to Google's algo. Again, it's literally impossible to figure out who created the original content. The only choice if anyone wants to rank pages is to have better SEO than the scraper sites (not optional).
  • Profile picture of the author yukon
    Banned
    Long term it's easier to create a better category.php template page inside your WP theme that doesn't add a full WP Post to the Category page. Instead of adding WP Post on the Category page, create a unique static description (per Category) then simply link out to the internal WP Post in the same category.

    [related thread]
    Any Good Silo Structure Plugins You Know Of?

    No sense in constantly chasing down pages you don't want in the SERPs. There's also no reason to not rank a Category page, it's the mother of all pages related to the subject/category on your site.
    • Profile picture of the author gearmonkey
      Originally Posted by yukon View Post

      Long term it's easier to create a better category.php template page inside your WP theme that doesn't add a full WP Post to the Category page. Instead of adding WP Post on the Category page, create a unique static description (per Category) then simply link out to the internal WP Post in the same category.

      [related thread]
      Any Good Silo Structure Plugins You Know Of?

      No sense in constantly chasing down pages you don't want in the SERPs. There's also no reason to not rank a Category page, it's the mother of all pages related to the subject/category on your site.
      Interesting point.

      Do some people find that some wordpress themes rank better than others? For example Genesis vs heatmapthemes, both paid and premium themes.
    • Profile picture of the author Rehmat
      Originally Posted by yukon View Post

      Long term it's easier to create a better category.php template page inside your WP theme that doesn't add a full WP Post to the Category page. Instead of adding WP Post on the Category page, create a unique static description (per Category) then simply link out to the internal WP Post in the same category.
      Isn't it a better deal to prevent indexing of categories in WordPress using WordPress SEO or any other plugin? Less experienced webmasters can't easily do what you have suggested.
      • Originally Posted by Rehmat View Post

        Isn't it a better deal to prevent indexing of categories in WordPress using WordPress SEO or any other plugin?
        Yes, preventing duplicate content from reaching the Google index is a good first step. However, many sites already have duplicate content in the index, which keeps their rankings from reaching their full potential. So a plugin/program to detect the duplicate content in the index would really be a big help.
        • Profile picture of the author ProSence
          Originally Posted by strategic seo services View Post

          Yes, preventing duplicate content from reaching the Google index is a good first step. However, many sites already have duplicate content in the index, which keeps their rankings from reaching their full potential. So a plugin/program to detect the duplicate content in the index would really be a big help.
          Can you please suggest a WP plugin to remove this duplicate content?
      • Profile picture of the author yukon
        Banned
        Originally Posted by Rehmat View Post

        Isn't it a better deal to prevent indexing of categories in WordPress using WordPress SEO or any other plugin? Less experienced webmasters can't easily do what you have suggested.

        This will create a unique Category page. If you don't know HTML/PHP find someone to make the changes, a decent web developer should be able to read that thread & have it finished in less than 1 hour for an average theme.
  • Profile picture of the author MikeFriedman
    So nobody has ever heard of a canonical tag around here?
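
For anyone who hasn't: a canonical tag is a `<link rel="canonical" href="...">` element in a page's head that names the preferred URL for that content. A rough way to check which canonical URL a page declares is sketched below with Python's standard `html.parser`. The sample HTML, the URLs and the `find_canonical` helper are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before comparing.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical category page that points at the original post.
SAMPLE_PAGE = (
    "<html><head>"
    '<link rel="canonical" href="http://www.yoursite.com/post-a/">'
    "</head><body>Post A</body></html>"
)

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```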
    • Profile picture of the author paulgl
      Originally Posted by MikeFriedman View Post

      So nobody has ever heard of a canonical tag around here?
      That would require working and learning.

      Nobody has heard of much, when you think about it. How about
      ditching WP and getting either a real CMS, your own coding, or
      even blogspot?

      In reality, duplicate content that causes any such penalties is
      rare. Very rare. Threads like this just make people think the sky is
      falling. This is 2013, people. 2013. Are you still next to
      Rip Van Winkle?

      I dare anyone to see how much "duplicate content" the wf,
      amazon, wikipedia, NYT, etc. have.

      Paul
      • Originally Posted by paulgl View Post


        In reality, duplicate content that causes any such penalties is
        rare. Very rare. Threads like this just make people think the sky is
        falling. This is 2013, people. 2013. Are you still next to
        Rip Van Winkle?

        I dare anyone to see how much "duplicate content" the wf,
        amazon, wikipedia, NYT, etc. have.

        Paul
        Really? It's funny that you cited the New York Times (NYT) in your "dare", as Matt Cutts has a video that addresses this very issue: Matt Cutts Addresses Duplicate Content Issue In New Video | WebProNews

        You cannot compare authority sites like the NYT or Wikipedia to an "average" site on the web. Yes, we are in 2013, and the above article was written only 7 months ago.

        My experience has shown me that by removing duplicate content from Google's index, a website's rankings will, in turn, increase within a 24 hour period.
        • Profile picture of the author paulgl
          Originally Posted by strategic seo services View Post

          You cannot compare authority sites like the NYT or Wikipedia to an "average" site on the web. Yes, we are in 2013, and the above article was written only 7 months ago.

          My experience has shown me that by removing duplicate content from Google's index, a website's rankings will, in turn, increase within a 24 hour period.
          That's just bogus. Man, you must have had a ton of sites and tons of dupe
          content. That's NOT the "average joe."

          Maybe your business was junk to begin with. Dump the trash, then see.
          That's different than cause and effect about "duplicate content."

          As far as "average joe sites" and authoritative sites, that's exactly
          how this "average joe" wants to do things. Exactly like the big boys.
          If amazon, NYT, ebay, etc. do not give a rat's hat about dupe content,
          neither will I.

          It's about junk, not duplicate content.

          Just more blind leading the blind down a rabbit hole.

          Paul
          • Profile picture of the author yukon
            Banned
            Originally Posted by paulgl View Post

            That's just bogus. Man, you must have had a ton of sites and tons of dupe
            content. That's NOT the "average joe."

            Maybe your business was junk to begin with. Dump the trash, then see.
            That's different than cause and effect about "duplicate content."

            As far as "average joe sites" and authoritative sites, that's exactly
            how this "average joe" wants to do things. Exactly like the big boys.
            If amazon, NYT, ebay, etc. do not give a rat's hat about dupe content,
            neither will I.

            It's about junk, not duplicate content.

            Just more blind leading the blind down a rabbit hole.

            Paul
            That's why so many struggle with SEO, they think big sites do magical things. Sites like Wikipedia don't do anything any other site can't do. The problem is some people are too lazy to dig in & research why big site pages rank. Sure they have authority but the big sites started out with a single web page just like everyone else. Wikipedia for example has awesome on-page SEO (internal links + relevant pages).
  • Profile picture of the author ucables
    Julia, thank you for this very interesting info!

    Do you know of any tool to check for repeated pages automatically?

    When you have thousands of pages indexed, it's not easy to check them all for duplicates.
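
One way to approach this yourself, sketched in Python under some assumptions: if you can already get the main text of each indexed URL (how you fetch and extract it is left out here), you can group URLs whose normalized text is identical. The `pages` mapping, the URLs and the `find_duplicate_groups` helper are hypothetical, and this only catches exact duplicates after whitespace and case normalization, not near-duplicates.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase the text and collapse all runs of whitespace to one space."""
    return " ".join(text.lower().split())


def find_duplicate_groups(pages: Dict[str, str]) -> List[List[str]]:
    """Group URLs whose normalized text hashes to the same digest."""
    groups = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical input: URL -> extracted page text.
PAGES = {
    "http://www.yoursite.com/post-a/": "Post A:  totally unique content.",
    "http://www.yoursite.com/category/guitars/": "post a: totally unique   content.",
    "http://www.yoursite.com/post-b/": "Post B: something else entirely.",
}

if __name__ == "__main__":
    for group in find_duplicate_groups(PAGES):
        print("Duplicate group:", group)
```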
  • Profile picture of the author GGpaul
    Originally Posted by strategic seo services View Post

    You can improve your website's ranking in Google by eliminating your site's duplicate content from the Google index.

    What counts as duplicate content and how do you know if it's showing in the index?
    Well, let's say that you have a blog. You have "Post A", which is totally unique content. However, that "Post A" is also showing up on a "category" page of your site. That category page is placed in Google's supplemental index and is considered to be duplicate content by Google.

    How do you check for this?

    Do a search like this in Google: site:http://www.yoursite.com

    Now, go to the last page of results, if you see a message like this:
    "In order to show you the most relevant results, we have omitted some entries very similar to the 42 already displayed.
    If you like, you can repeat the search with the omitted results included."
    1. Click on "repeat the search with the omitted results included."
    2. Now, find the pages that are repeats (eg: the Category Page with Post "A"). Get the URLs of those pages and submit content removal requests to Google through Webmasters Tools: Completely remove an entire page - Webmaster Tools Help
    3. Now, the last step: put a disallow command in your robots.txt file so that URL is no longer crawled or indexed by Google.

    URLs are usually removed from Google's index within 24 hours.

    Use this strategy & witness a great improvement in your site's rankings! This strategy has helped me a great deal. Just thought I'd share it with you.
    Is there an easier way to do this? This makes it difficult for me when I get 8,000+ results.
  • Profile picture of the author brettb
    I'm pretty sure that duplicate content sank my older sites. In one example I found one of my articles was stolen from my site but the stolen article had an older timestamp in Google!

    Matt Cutts has no answer for this.
  • Profile picture of the author AravGupta
    A better way to check for duplicate on-page elements is to add your website to Webmaster Tools, open the HTML Improvements section (under Optimization), download the list of errors, fix them manually and resubmit your sitemap. I did this recently and have benefited a lot.
  • Profile picture of the author GeorgR.
    That category page is placed in Google's supplemental index and is considered to be duplicate content by Google.
    This is just a VERY odd way to go about this, and I doubt it would impact your rankings significantly anyway.

    The CORRECT way to do it is right from the start, on your site. All archive pages, categories, tags, calendar pages etc..etc.. should ALWAYS only have excerpts. There should not be dupes on your site with the same content on your actual post AND category pages, tags or wherever else.

    But even then, I doubt that Google is still so "dumb" that it would rank your site down because there are dupes on it. This is pretty common and I cannot see it being a major ranking factor.
  • Profile picture of the author successproducts
    The best way is to ping your article the minute you post it. I've heard this would put the stamp on YOUR article before anybody else.
  • Profile picture of the author beseenontop
    Several people have suggested the best way to deal with this is prevention - configure your blog so it does not create the duplicate content pages in the first place.

    Do you use Wordpress? This webinar from SEOMoz and @nickherinckx does a good job explaining how to prevent these duplicate content issues. Advanced WordPress SEO: Actionable Advice for Ensuring Your WordPress Content is Found - Webinar | SEOmoz. It was only published a few days ago.

    This article - Guide to Bringing Your WordPress Blog Up to Today's Google Algorithm Standards | Search Engine Journal - was on Search Engine Journal last month. It's an excellent how-to guide with helpful screenshots. It's not quite as comprehensive as the webinar though. For example, it does not cover the PHP file modification that ensures only excerpts get posted to archive pages. Nor does it suggest how to customize your category pages so they are unique.

    The two together are gold.
    • Profile picture of the author nik0
      Banned
      Maybe I misunderstand something here, but why has no one suggested noindexing the tag/category/author etc. pages?

      Yukon's suggestion to create a unique category page is miles better, though, and excerpts are also a very good alternative. I think most WP themes are already set up to create excerpts, aren't they?
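
Noindexing is normally done with a robots meta tag such as `<meta name="robots" content="noindex, follow">` in the page head. To confirm that a plugin or theme really emits it on archive pages, something like the following Python sketch could be used. The sample markup and the `has_noindex` helper are assumptions for illustration, and the sketch only looks at meta tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a noindex robots meta directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical category page head.
CATEGORY_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

if __name__ == "__main__":
    print("noindex present:", has_noindex(CATEGORY_PAGE))
```

Keep in mind that Google has to be able to crawl a page to see its noindex tag, so a page that is also blocked in robots.txt may never have the tag read.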
      • Profile picture of the author gearmonkey
        Originally Posted by nik0 View Post

        Maybe I misunderstand something here but why no one has suggested to noindex the tags/category/author etc. pages?

        Yukon's suggestion to create a unique category page is miles better though and excerpts is also a very good alternative, I think most WP themes are already setup that it creates excerpts isn't it?
        I really love this plugin - Meta Robots WordPress plugin
  • Profile picture of the author Joe Karl
    Originally Posted by strategic seo services View Post

    You can improve your website's ranking in Google by eliminating your site's duplicate content from the Google index.

    What counts as duplicate content and how do you know if it's showing in the index?
    Well, let's say that you have a blog. You have "Post A", which is totally unique content. However, that "Post A" is also showing up on a "category" page of your site. That category page is placed in Google's supplemental index and is considered to be duplicate content by Google.

    How do you check for this?

    Do a search like this in Google: site:http://www.yoursite.com

    Now, go to the last page of results, if you see a message like this:
    "In order to show you the most relevant results, we have omitted some entries very similar to the 42 already displayed.
    If you like, you can repeat the search with the omitted results included."
    1. Click on "repeat the search with the omitted results included."
    2. Now, find the pages that are repeats (eg: the Category Page with Post "A"). Get the URLs of those pages and submit content removal requests to Google through Webmasters Tools: Completely remove an entire page - Webmaster Tools Help
    3. Now, the last step: put a disallow command in your robots.txt file so that URL is no longer crawled or indexed by Google.

    URLs are usually removed from Google's index within 24 hours.

    Use this strategy & witness a great improvement in your site's rankings! This strategy has helped me a great deal. Just thought I'd share it with you.
    Hi there,

    I was going to order a package from you when I stumbled across your duplicate content post, and I was wondering if you can help.


    Now, find the pages that are repeats (eg: the Category Page with Post "A"). Get the URLs of those pages and submit content removal requests to Google through Webmasters Tools: Completely remove an entire page - Webmaster Tools Help

    I can't find any pages that are repeats. OK, I must be doing something right then...
  • Profile picture of the author Paul Tovey
    Thanks for this, I was wondering how to do this for ages!
  • Profile picture of the author seoed
    My experience has shown me that by removing duplicate content from Google's index, a website's rankings will, in turn, increase within a 24 hour period.
    How much did your rankings or traffic increase?
  • Profile picture of the author smodha
    Pro tip: if you are worried about content from your sites being stolen, then head over to Tynt. Sign up and they will give you a script that you can run on your WordPress/Blogger/Tumblr blog etc. As soon as somebody copies and pastes content from you, your site will be rewarded with a backlink from their site.

    This is a paid service, but the free account will give you access to a 30-day analytics report about your site. It's nice to earn some backlinks on auto-pilot, and in my tests it's pretty effective.

    Note: I am not affiliated with tynt.com in any way. There are no affiliate links in this post.
  • Profile picture of the author rossnmia
    Definitely true - I had all my WordPress tag pages showing as duplicate pages. Took those off and voilà, rank improved.
  • Profile picture of the author rossnmia
    By the way, Google Webmaster Tools is great for telling you what's duplicate - you get what they consider duplicate pages straight from the horse's mouth, so to speak.

    Also, duplicate content can be pages with little content, e.g. category pages with two post excerpts on them, lots of surrounding sidebar content and duplicate or missing meta tags. In my experience, those pages mean a big slap from Google.
  • Profile picture of the author IMSince2003
    Sigh. Do you really think that Google doesn't know what WordPress is and how it works? There are a zillion sites that use WP and I can say fairly certainly that Google is not penalizing them just because they use a standard WP installation.
