Content scraped and ranks lost, what is the logical inference?

Simple question. Site loses ranks by about 30 positions. Around the same time, scrapers rank ahead when searching for "a full sentence within the content".

What is the most logical conclusion?

A. The site lost ranks because the content was scraped by hundreds of sites, and the ranks may be restored if I rewrite the content? OR
B. The site was penalized for some reason, and as a result it doesn't rank first for its own content. I first have to identify the reason and get rid of the penalty before it will rank for its own content?

Background:
  • The loss of ranks did not coincide with any known algorithm update or Penguin/Panda/Pigeon/Doorway update.
  • There is no manual penalty warning
  • No hacked content
  • Still ranks 1st for its own brand name
  • Did not buy any blackhat links, PBN, text links, blogroll links, etc
Please point me in the right direction.
  • Profile picture of the author purush245
    Do I take it that there is no clear answer for this kind of a situation?
  • Profile picture of the author fumla
    What did you do to protect your content? This has been seen many times before: sometimes someone simply republishes your content and ranks higher than you do. It may be the authority of that site that does the magic.
    • Profile picture of the author purush245
      Originally Posted by fumla View Post

      What did you do to protect your content?
      Nothing really. I am not sure you can really protect your content from scrapers other than by continually rewriting it.

      The scrapers do not have better authority than my site. The scraping is done by someone who injects pages of scraped content into regular sites and redirects them to their money site. There are thousands of them.

      But I am not sure whether the scrapers are ranking above me because Google thinks I have duplicate content, or because my site was slapped with a penalty of some sort so that it doesn't rank for its own content. Is it usual for a penalized site not to rank for its own content?
    • Profile picture of the author SEO-Dave
      Originally Posted by fumla View Post

      What did you do to protect your content? This has been seen many times before: sometimes someone simply republishes your content and ranks higher than you do. It may be the authority of that site that does the magic.
      What can you do (that's effective) to protect content from automated scrapers?

      All you can do is make it harder. Not having a full RSS feed, for example, stops a lot of full-content copying.

      If scrapers are getting your content without an RSS feed, regularly changing the HTML layout can mess them up (a pain in the butt to do, though).

      Scrapers look for elements that are consistent in the content, like an H1 header with a particular CSS class that always holds the article title, or a div class that starts the main content.

      They use these to determine start/end points for pulling the content out of the template code (sidebar links, footers, etc.) so all they get is the content.

      Modifying the HTML code means the scraper's current rules break. Article directories do this occasionally, and you'll find autoblog scripts have to be updated so their users can still scrape the directories.

      For the average webmaster it's going to be the RSS feed that's being scraped, so don't have a full RSS feed and it's harder to scrape automatically.

      David
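
To make the point about consistent markup concrete, here is a minimal Python sketch of how a simple rule-based scraper might work, and why changing the template breaks it. The class names (`entry-title`, `entry-content`), the sample templates, and the `extract_article` helper are all made up for illustration; real scraper scripts vary widely.

```python
import re

def extract_article(html, title_class="entry-title", body_class="entry-content"):
    """Pull title and body using fixed start/end markers, as a naive scraper might.

    The class names are hypothetical examples of 'consistent elements'.
    """
    title = re.search(
        r'<h1 class="%s">(.*?)</h1>' % re.escape(title_class), html, re.S
    )
    body = re.search(
        r'<div class="%s">(.*?)</div>' % re.escape(body_class), html, re.S
    )
    if not title or not body:
        return None  # the scraper's rules no longer match the template
    return title.group(1).strip(), body.group(1).strip()

# Hypothetical template the scraper was written against.
OLD_TEMPLATE = (
    '<h1 class="entry-title">My Post</h1>'
    '<div class="entry-content">Unique text.</div>'
    '<div class="sidebar">links</div>'
)

# Same content after the site owner changes the HTML layout.
NEW_TEMPLATE = (
    '<h2 class="post-heading">My Post</h2>'
    '<section class="post-body">Unique text.</section>'
)

print(extract_article(OLD_TEMPLATE))  # rules match the old layout
print(extract_article(NEW_TEMPLATE))  # rules fail until the scraper is updated
```

This only slows down scrapers that rely on fixed rules; anyone willing to update their rules can scrape the new layout too.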
  • Profile picture of the author SEO-Dave
    Originally Posted by purush245 View Post

    Simple question. Site loses ranks by about 30 positions. Around the same time, scrapers rank ahead when searching for "a full sentence within the content".
    That's one of the first things I check to see if a domain holds a penalty. If your domain isn't ranking for relatively long unique sentences, you might have a penalty.

    Search both with speech marks around the sentence and the loose search. The speech mark version will limit the results to your site and those that are scraping your content. I'm concerned when I find original content isn't ranked number 1 for the speech mark searches.

    Try sentences high in the content (these will be scraped more) and much lower (scraped less). If you have an RSS feed on your site, don't set it to load the entire content, or you'll have entire articles scraped rather than snippets.

    Looking at the quality of the sites using your content can indicate whether it's a penalty or a case of the scraped content being on domains that have more going for them SEO-wise. If Wikipedia scraped your content, for example, it would probably rank above your original domain. So if the sites ranking above you are OK-quality sites (sites with a strong backlink profile) rather than really low-quality scraper sites**, it could be that your domain completely lacks backlinks and can't compete against a domain with some off-site SEO.

    ** Scraper sites tend not to have good off-site SEO. They tend to be built in bulk and left to run on autopilot with minimal link building, because long term they tend to lose rankings. Build an autoblog, add some basic backlinks, leave it to run making money until it's not worth maintaining any more, rinse, repeat....

    Remember, Google doesn't know for certain which is the original source, so it tends to rank copied content based on other SEO factors. The SEO assumption is that over time poor-quality scraper sites are slowly weeded out due to their lack of backlinks compared to the original source. It's far from a perfect algorithm.

    David
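
As a rough aid for the check described above, this Python sketch builds the two Google search URLs (quoted and loose) for a sentence taken from your content. The `health_check_queries` helper and the sample sentence are assumptions for illustration; it only builds URLs for you to open in a browser and does not fetch or parse any results.

```python
from urllib.parse import quote_plus

GOOGLE_SEARCH = "https://www.google.com/search?q="

def health_check_queries(sentence):
    """Return the quoted ('speech mark') and loose search URLs for one sentence."""
    # Strip any quotes already in the sentence so they don't break the phrase search.
    cleaned = sentence.strip().replace('"', "")
    return {
        "exact": GOOGLE_SEARCH + quote_plus('"%s"' % cleaned),
        "loose": GOOGLE_SEARCH + quote_plus(cleaned),
    }

# Hypothetical sentence; in practice use one from high in the content
# and one from much lower, as suggested above.
queries = health_check_queries("a full sentence within the content")
print(queries["exact"])
print(queries["loose"])
```

If the original page isn't first for the quoted version, that is the signal worth investigating.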
    • Profile picture of the author purush245
      SEO-Dave, thanks for the elaborate reply. My site has a better backlink profile than those scrapers. It has 25 .edu links that are the result of running a scholarship program for students. The content went unscraped for a long time, so Google should know which site created the content first. Suddenly thousands of sites showed up when searching for "a sentence from my content", which even included our brand name. We noticed this after the ranks tanked. Perhaps the only way to know whether the penalty or the scraping came first and led to the ranks tanking is to rewrite the content?
  • Profile picture of the author SEO-Dave
    Does sound like a penalty. I very much doubt modifying the content just to make it unique again would help.

    Have you confirmed nothing is broken?

    Check that the site is fully indexed by doing this Google search:

    site:http://domain.tld

    This will show everything indexed under your domain.

    Check your robots.txt file for issues.

    If you are a WordPress user running one of the SEO plugins, check that you aren't inadvertently wrecking your on-site SEO with noindex/nofollow options.

    Check whether the drop coincided with a Google update. You could be the baby thrown out with the bathwater as Google rolled out some anti-spam measures.

    If you haven't done anything blackhat look at any site changes over the past 3 months for mistakes.

    David
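
The robots.txt and noindex/nofollow checks above can be partly automated. This is a minimal Python sketch using only the standard library; the helper names (`googlebot_allowed`, `robots_meta`), the sample robots.txt, and the `domain.tld` URLs are assumptions for illustration. Note that `urllib.robotparser` may not interpret every rule exactly the way Google does, so treat the result as a first pass rather than a verdict.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

def googlebot_allowed(robots_txt, url):
    """Check whether a robots.txt body lets Googlebot fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            content = attrs.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())

def robots_meta(html):
    """Return the set of robots meta directives found in a page's HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives

# Hypothetical inputs; paste in your own robots.txt and page source.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

print(googlebot_allowed(SAMPLE_ROBOTS, "http://domain.tld/blog/post"))
print(googlebot_allowed(SAMPLE_ROBOTS, "http://domain.tld/private/page"))
print(robots_meta(SAMPLE_PAGE))
```

A page that is blocked in robots.txt or carries an unintended `noindex` would explain a ranking drop without any penalty being involved.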
    • Profile picture of the author purush245
      David, good points. Thank you. As I mentioned, the drop didn't coincide with any known update. It happened on the 15th of March last. I did a thorough site-health checkup and nothing really shows up as suspicious. All the pages are indexed, and there is no hacked content and no paid or blackhat links. The only thing that might come close to being spammy is guest blog links. But the anchor texts are NOT keyword-rich, and the guest posts are carefully placed on blogs that have good traffic and enjoy ranks in Google.

      But in my limited experience, I have not come across a case where a site slapped with a penalty (Panda/Penguin/manual, etc.) fails to rank for a quoted sentence from its own content, unless of course there are heavyweight scrapers. But I might be mistaken.
  • Profile picture of the author patadeperro
    It may be a matter of better backlink quality: the domain that is ranking higher than yours may have a lot of authority/quality backlinks pointing towards it, and through the site architecture some of that link juice is passed to inner pages, which then rank higher than yours.
    • Profile picture of the author purush245
      the domain that is ranking higher than yours may have a lot of authority/quality backlinks pointing towards it
      But in my case, as it happens, none of the scrapers have better link quality than mine, which made me worry whether the scrapers are ranking ahead because my site is under some kind of penalty.
  • Profile picture of the author yukon
    Banned
    Originally Posted by purush245 View Post

    Simple question. Site loses ranks by about 30 positions. Around the same time, scrapers rank ahead when searching for "a full sentence within the content".
    Fortunately you're the only person on earth searching the full sentence in double quotes.

    Sure, you can use that technique to find scraper sites, but the actual sentence/keyword is useless. Traffic isn't searching for a sentence.
    • Profile picture of the author SEO-Dave
      Originally Posted by yukon View Post

      Fortunately you're the only person on earth searching the full sentence in double quotes.

      Sure, you can use that technique to find scraper sites, but the actual sentence/keyword is useless. Traffic isn't searching for a sentence.
      Yukon, the search technique isn't to check SERPs with traffic it's to check the health of a site.

      If a site doesn't rank for highly specific SERPs where, beyond the scrapers, it is unique content that no one is targeting, there's probably a problem.

      When this webpage is reindexed a Google search for:

      "Yukon, the search technique isn't to check SERPs with traffic it's to check the health of a site."

      Should always show this domain ranked number 1. If Warrior Forum were ever penalized by Google, you might find scraped versions of this content ranking above this page.

      And from an SEO perspective, if a site can't rank for these types of SERPs, what chance does it have for even long-tail SERPs with a tiny amount of traffic? That is why the OP is concerned: the traffic drop, combined with failing to rank for these easy long-sentence SERPs, strongly suggests a penalty.

      David
      • Profile picture of the author purush245
        David, thank you so much for the wonderful explanation. Never in my dreams did I imagine that my searching for a sentence in quotes would be interpreted as an effort to get traffic!
