CLARIFICATION: Duplicate content, Domain Sandboxing, Footprints and the resulting Google Slaps

by 47 replies
58
Many people are asking questions relating to Duplicate Content, Domain Sandboxing, and Footprints. Nevertheless, many answers tend to end with a question mark.

Warrior Forum is "a Great Source" filled with highly qualified, experienced Internet Marketers and SEO Experts.
Many TOP Warriors here are renowned for their IM qualities and highly respected.

Can we use this opportunity and their expertise to discuss and clarify the exact meaning of the following, in the context of autoblogging:

1. Duplicate Content = DC
a) What is DC exactly
b) To what extent can DC be penalized
c) Are overly similar articles on one site classified as DC (Autoblogs)
d) When is an article classified as unique, and where is the borderline (if it exists at all)
2. Domain Sandboxing with Backlinks = DS
a) What is the meaning of Domain Sandboxing through incorrect backlinking (backlinking to the same domain)
b) What role does DS play when backlinking from an addon-Domain to the main Domain
c) What role does DS play when backlinking from 10 sub-Domains to the same addon-Domain or from one addon-domain to another on the same server
3. Footprint = FP
a) What is a FP
b) When will a FP be recognized by an SE
c) To what extent can a FP be penalized
d) What types of FP are there
e) How can a Newbie recognize and avoid FP
I am sure that clarifying these points will be of enormous help to many existing and upcoming Warriors, and it would be good to hear some expert opinions.

Thanks
Tigerwar
#search engine optimization #clarification #content #domain #duplicate #footprints #google #resulting #sandboxing #slaps
  • Hi Tigerz,

    I'm not qualified to talk about points 2 and 3, but I can talk about Dupe Content.

    Google says:

    "Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar"

    Let's take an article on the BBC website.

    The BBC has a syndication team. They actively seek out other news sources to carry their stories.

    The result? 1000's of sites carrying the same story. Verbatim.

    Duplicate content? Yep. Penalized? Nope.

    Google's smarter than that. But how?

    My guess is it all goes back to their "Page Rank" algorithm.

    Authority...not content...is king.

    A site with authority...maybe by necessity...becomes a legitimate aggregator.

    A site with no authority is simply a scraper.

    It's an over-simplification, but you can best trust those sites that are trusted by other trusted sites.

    Lol, it's the internet equivalent of an "old boys network"

    So the question should be...

    How do I become an authority?

    Google don't answer that, but I suspect the answer is visitors/reach/PR/Trying hard...

    Cheers,
    Steve
    • [ 3 ] Thanks
    • [2] replies
    • Interesting theory. I'm testing something out with PLR on a new site. The site was indexed a while ago, but had no content. I want to see if it will rank with enough backlinks, regardless of the duplicate content....We'll see what happens.

      Frank
      • [ 1 ] Thanks
      • [1] reply
    • Hi Steve,

      Love your product..WpUnique... It's ranking my sites at no.1 now!

      but to your post.....

      I know of a number of people that state the opposite, here for example:


      Duplicate content:
      "
      I could enter more quotes here but they're pretty repetitious and this simply confirms my first post: most answers end up with a question mark!

      thanks
      Best regards
      Tigerz
  • I agree with Steve, these are all Groundhog Day topics.

    I suggest that you do some research on this yourself with some test sites and observe what happens with news sites, music lyric sites, game cheat sites and the like. You'll discover that 'DC' isn't as bad as some make it out to be but that you may be stacking the deck against yourself ranking-wise if you use it and don't have authority links.

    There is no spoon...

    ...but there are a lot of excuses.

    Well, this one isn't exactly a topic worthy of Ned Ryerson.

    My take is that Google's algorithm searches for patterns. That's how data mining applications work, finding information patterns. Certain patterns trigger alerts for humans to review while others automate an action. These patterns are always changing so it's hard to nail down exactly what might trigger an action. My observation is that Google can bring new action triggers online relatively quickly if they think there's reason to act.

    So, you can't know exactly what's going to fit a pattern since it's a moving target. However, it is safe to assume that certain scripts with a spammy history (BANS, YACG, others) are targeted for deindexing. Certain actions, such as obvious link wheels and massive autogenerated link building across a few sites with little or no authority, will probably get noticed, but the action taken seems to vary.

    You can avoid problems in this area by using popular CMS systems, such as WordPress or Drupal, a system of your own exclusive design, or just plain HTML. Then avoid using obvious plugins and scripts that might show up as a pattern. For linking, avoid well-publicized SEO patterns and tricks (link wheels, etc.) and stick with naturalized patterns.
    • [ 1 ] Thanks
    • [1] reply
    • Thanks bgmacaw

      The topic may, as Steve Crooks said, be scattered all over the forum, but I am looking for more than simple answers from Peter, Pan and Paul standing at the next kiosk, who have just read a cover story and now believe themselves omniscient on the topic. Rather, I was hoping for discussion from authority (experienced) warriors who might, last but not least, be able to eradicate that question mark always left after the answers.

      I am not looking for assumptions or the "bible says....... so do it that way."

      I have personally done a lot of testing over 3 years with DC, yet I'm still not in a position to give a concrete answer; basically, I was never penalized unless doing as Steve Fullman mentioned, entering DC on the same server with two different sites.

      As to your last entry:
      In this fast-evolving IM platform, it's nearly impossible to monetize a site (with fast results) if we eliminate the possibilities offered through autoblogs.
      If we believed everything the Doc tells us about which foods are healthy and which are not, we'd probably starve to death.
      By "avoiding", I think you possibly meant "find the good programs and not just the first offer thrown at you".

      I would love to know how populated a pattern has to be before being recognized: 1000, 2000, 100,000?
      Programs like wpRobot have been around for a long time and it doesn't appear that they'll be "marked" in the next decade.

      Tigerwar
      • [1] reply
  • I have a couple of friends with exactly the same article posted on their blogs a week apart, but both articles reached the first page in Google SERPs. So I don't think every piece of duplicate content gets penalized.
    • [1] reply
    • Hi michael

      maybe as Steve fullman suggested:

      Tigerwar
      • [1] reply
  • I've tested this to death and dupe content or PLR ranks fine with the appropriate amount of backlinks.

    Lee
    • [ 1 ] Thanks
    • [1] reply
    • Hi Lee,

      Very interesting, a picture paints more than a thousand words!

      Are you saying it's worth the risk, or that there is no risk involved if the site otherwise complies, i.e. SEO + strong backlinks?

      Thanks
      Tigerwar
  • I believe all the information you're looking for can be found over at AutoblogBluePrint.com, which gives a very thorough course on new blog building, autoblog building and how to successfully dance through all the sand traps you worry about, while giving proper perspective on them.

    Aff Link: Earn Over $3000 Per Month on Autopilot | Auto Blog Blueprint 2.0
    non Aff Link: Earn Over $3000 Per Month on Autopilot | Auto Blog Blueprint 2.0

    also,

    • [1] reply
    • Hello atwellpup,

      I am personally not looking for help! I am hoping to receive the opinions of some TOP Expert Warriors to enable not only myself but also others to weigh the results against the so-called "bible".

      Tigerwar
      • [1] reply
  • I am saying there are no risks involved. Not at this time anyway. Who knows what google will do in the future but I think it is unlikely that they will be penalizing sites that have duplicate content - content syndication is too widespread and a very legitimate practice.
    • [ 1 ] Thanks
  • btw....Don,

    there are truly people looking for quick, correct answers. I don't understand the necessity of your question above regarding "who are you?" The answer to that was in my question, but maybe you did it for this reason......



    The river was wide and swift, and the scorpion stopped to reconsider the situation. He couldn't see any way across. So he ran upriver and then checked downriver, all the while thinking that he might have to turn back.
    Suddenly, he saw a frog sitting in the rushes by the bank of the stream on the other side of the river. He decided to ask the frog for help getting across the stream.
    "Hellooo Mr. Frog!" called the scorpion across the water, "Would you be so kind as to give me a ride on your back across the river?"
    "Well now, Mr. Scorpion! How do I know that if I try to help you, you won't try to kill me?" asked the frog hesitantly.
    "Because," the scorpion replied, "If I try to kill you, then I would die too, for you see I cannot swim!"
    Now this seemed to make sense to the frog. But he asked. "What about when I get close to the bank? You could still try to kill me and get back to the shore!"
    "This is true," agreed the scorpion, "But then I wouldn't be able to get to the other side of the river!"
    "Alright then...how do I know you won't just wait till we get to the other side and THEN kill me?" said the frog.
    "Ahh...," crooned the scorpion, "Because you see, once you've taken me to the other side of this river, I will be so grateful for your help, that it would hardly be fair to reward you with death, now would it?!"
    So the frog agreed to take the scorpion across the river. He swam over to the bank and settled himself near the mud to pick up his passenger. The scorpion crawled onto the frog's back, his sharp claws prickling into the frog's soft hide, and the frog slid into the river. The muddy water swirled around them, but the frog stayed near the surface so the scorpion would not drown. He kicked strongly through the first half of the stream, his flippers paddling wildly against the current.
    Halfway across the river, the frog suddenly felt a sharp sting in his back and, out of the corner of his eye, saw the scorpion remove his stinger from the frog's back. A deadening numbness began to creep into his limbs.
    "You fool!" croaked the frog, "Now we shall both die! Why on earth did you do that?"
    The scorpion shrugged, and did a little jig on the drowning frog's back.
    "I could not help myself. It is my nature." Then they both sank into the muddy waters of the swiftly flowing river.

    Regards
    Tigerwar
    • [1] reply
    • Hi tigerwar,

      My question was meant to be rhetorical and intended for all readers. It would be nice if folks would look for answers already posted before asking for the second or third time in the same day. Or they could just look at the FAQs pinned to the top of this board. If they aren't able to get the answers from there, then post a question. I believe that is considered proper forum etiquette.

      For those who are more visually inclined:

      YouTube - Posting and You
      • [1] reply
  • So far we have had some great replies on questions 1 and 3. Thank you all for that!

    My Second question regarding getting sandboxed has apparently been misinterpreted due to the discussion about duplicate content.

    b) What role does getting "Sandboxed" play when backlinking from an addon-Domain to the main Domain
    c) What role does getting "Sandboxed" play when backlinking from 10 sub-Domains to the same addon-Domain or from one addon-domain to another on the same server

    Perhaps it would have been better to say penalized instead of sandboxed.

    so once again:

    b) can content, a page or a website be penalized (degraded) when backlinking from an addon-Domain to the main Domain on the same server

    c) can content, a page or a website be penalized (degraded) when backlinking from say, 10 sub-Domains to the same addon-Domain or from one addon-domain to another on the same server


    many thanks
    Tigerwar
  • I autoblog, which was the context of the original post.

    1) I find that some autoblogged sites jump around in Google and Yahoo rankings quite a lot, especially in the early days, but nothing I've ever done using unmodified PLR or republished articles from EzineArticles etc has ever harmed my site in any way that might make me think of a penalty or sandbox. What I would say though is that unoriginal content does not benefit my site with anywhere near the strength that original content does. Whenever I have interrupted my autoblogging to insert some original content, that has (once indexed) been a powerful shot in the arm to my rankings.

    I have ranked sites at #1, #2 and #3 in Google just with unmodified PLR, especially for long tail keyphrases. I think this had more to do with the domain name being an exact match for the searched keyword phrase though; Google seems to love those, especially for microniches. One is still there and I have never touched it - one page of content, Adsense ads, and it has survived a human review without incident. The others have dropped to #5 - #15 and some have dropped even further, but generally I don't seem to be able to do anything to my sites to harm their rankings.

    BUT... having said that, I once tested duplicate content on the same domain, tagging one as canonical as it was also an original piece, and the other page was a straight dupe of it. The original page ranked well and got PR4, whereas the copy never ranks and has no PR, despite the same backlinking strategy for both pages.
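
To check which copy of a page is declared as the original, here is a minimal sketch. It assumes the tagging was done with a `<link rel="canonical">` element in the page head; the post does not say how the tag was applied, and the page markup and URL below are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop after the first match.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page that points at the original (illustrative only).
duplicate_page = """
<html><head>
<title>Copy of the article</title>
<link rel="canonical" href="https://example.com/original-article/">
</head><body><p>Same text as the original.</p></body></html>
"""

print(find_canonical(duplicate_page))
```

Running `find_canonical` over both pages of a test like this one shows quickly whether the duplicate really points at the original or declares nothing at all.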

    The other thing I tried was using subdomains as Addons to try and take the top ten slots in Google. I soon found that I couldn't get more than 3 indexed pages to rank from the main domain and subdomains combined. Mostly it would max out at 2, sometimes it would go to three but that was the exception not the norm for me.

    When is an article classified as unique? From my tests it is when it is approaching 50% originality of word content AND layout and structure. I have achieved a lot of uniqueness just by breaking up paragraphs, merging others together, mixing in images and video, adding a table of contents and in-page anchors for navigation, and varying the use of bold, italic, underline and font sizes and colours.
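
For anyone who wants to put a rough number on "originality", here is a small sketch using word shingles (overlapping runs of three words). This is only an assumption about how such a score could be estimated; it is not how Google, CopyScape or DupeFreePro actually measure it, and it ignores layout and structure entirely. The function names and sample sentences are made up for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word runs of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def originality(candidate, source, size=3):
    """Share of the candidate's shingles that do not appear in the source."""
    cand = shingles(candidate, size)
    if not cand:
        return 0.0
    src = shingles(source, size)
    return len(cand - src) / len(cand)


# Illustrative texts: the rewrite tops and tails the source with new words.
source = "The quick brown fox jumps over the lazy dog"
rewritten = ("In my opinion the quick brown fox jumps over "
             "the lazy dog every single day")

score = originality(rewritten, source)
print(f"originality: {score:.0%}")
```

A score near 0% means a straight copy and a score near 100% means almost nothing is shared; where the useful cut-off lies (the post suggests around 50%) is something to test, not something this sketch can tell you.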

    On top of that, the word content becomes original the moment that you start adding your own words into the mix, provided that it makes sense from a grammar and syntax point of view. I tried using some of those unique content plugins that add random ASCII characters etc to fool the search engines, and they do work well, but I got anxious about how a human review might see them one day in the future (they have all survived human review so far - I have called Google to review several of my sites to see if they will spot some of the techniques and so far, nobody has. I think the quality of human reviewers is highly variable though, for reasons that I won't disclose here) so I stopped using them.

    As far as autoblogging goes, if I post articles unedited straight to my sites then Google does not see them as unique. If I mix them together, in a sort of ContentFX style, then they are significantly better and CopyScape and DupeFreePro flag up parts but not the whole thing.

    My best technique for making word content unique is topping and tailing the original piece with my opinion, and inserting my opinion throughout in parentheses and italics. So far, ranking hasn't been a problem.

    Overall though, I do feel that while autoblogging techniques work well at the moment, they probably won't in the long term. In the long term, high quality content and building an authority site with mixed media is always going to be a strong contender for #1 in the charts.

    2) Sandboxing - I don't think it exists, other than that Google will ignore pages on the same domain/subdomain beyond the three most relevant ones for any keyword. Excessive backlinking from your own server to another site on your own server is just ignored by Google as far as I have seen. Google DOES however love it when you deep link to your pages from your main page, but that's more to do with site structuring SEO and passing domain PR faster to subpages. Having loads of links from one of your own subdomains to your main domain is too blunt an instrument and gets ignored by Google as being backlink manipulation. Yahoo don't tend to worry about it at all though. Haven't tested Bing enough to tell yet, but early signs are that Bing behaves much more like Google than it does Yahoo. So I don't think there is a sandbox, I think your links just get ignored. I tried black hatting a friend's site once in an experiment, doing all sorts of crazy linking to it to try and get it demoted in SERPs. Couldn't do it.

    3) As for footprints, well everything you do is a footprint. If all your autoblogs are on Wordpress as a CMS, that's a footprint. Yes the search engines can detect a lot of footprints. Do they act on it though? Mostly no, until you give them a reason e.g. you start trying to game Adsense, or you get loads of poor sites to #1 etc. How do they penalise you if they detect it? I've had a site thrown out of the top 1000 because I was using a linkwheel that carried loads of porn and viagra spam. I had an Adsense account banned for arbitrage, I just didn't realise back then how easy it was to detect those types of footprint.
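
To make the CMS footprint concrete, here is a rough sketch that groups a handful of pages by the `generator` meta tag, which WordPress prints in the page head by default. It is only one visible footprint among many, nothing here reflects how a search engine actually detects patterns, and the site names and markup are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Collects the content of the first <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.generator is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "generator":
            self.generator = attr_map.get("content")


def group_by_generator(pages):
    """Map each generator string to the list of sites that expose it."""
    groups = defaultdict(list)
    for site, html in pages.items():
        finder = GeneratorFinder()
        finder.feed(html)
        groups[finder.generator or "unknown"].append(site)
    return dict(groups)


# Hypothetical pages (illustrative markup, not real sites).
pages = {
    "site-a.example": '<head><meta name="generator" content="WordPress 3.0"></head>',
    "site-b.example": '<head><meta name="generator" content="WordPress 3.0"></head>',
    "site-c.example": "<head><title>Plain HTML site</title></head>",
}

for generator, sites in group_by_generator(pages).items():
    print(generator, "->", sites)
```

If every site in a network lands in the same group, that shared tag is exactly the kind of pattern the post is talking about.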

    I think if newbies concentrate on creating good content that users will enjoy, and ignore 'magic bullet' stuff that IM'ers seem obsessed with, then they will avoid footprints and so won't have to worry about them. Only the guys and dolls trying to take shortcuts and avoid doing the work to create quality content, need fear the footprint.

    I am really interested in discussing what kinds of footprint there are in use out there, and how effective / undetectable people find them. e.g. Hamiltonian paths.

    Hope that helps and was the kind of input you were looking for.
    • [ 2 ] Thanks
    • [1] reply
    • Hi William,
      Thanks very much for that, the majority basically confirms Don's explanations and you showed everyone this in a detailed breakdown of your actual experience.

      You're right, and this is also what I teach. There's no point whatsoever in newbies trying to run before they can walk, but you must admit, the way modern IM is being characterized and shaped by many JV prelaunches, it's not easy to convince newbies to learn the basics first when the smell of gold is pushed right under their noses and they are told that they can have the same by automating the process.
      Especially when they see that it works for intermediates and advanced marketers!

      That's much appreciated
      Best regards
      Tigerwar
  • In all honesty... this has been the most helpful thread I've ever encountered on Warrior Forum. The kind of discussion that is usually only found on paid forums / memberships.

    Thanks to tigerwar!!

    I'd appreciate it even more if the moderators could make this thread sticky. We would then have a single 'have them all' thread for these sorts of questions, which I believe are being asked almost every day here.

    Just my 2 cents.. but I can't deny the benefit of this thread. I'm not the kind of guy who appreciates digging through Search function results for more than 3 hours just to look for expert comments on duplicate content or sandboxing.


    Ryan
    • [ 1 ] Thanks
  • Hi Daniel,
    Thanks for your "constructive" contribution to the thread. I have already done this, but much to my amazement the dupe content wasn't discarded as I thought it would be... I used different keywords (with top SEO) in exactly the same text. Both keywords were listed at the top of Google, so e.g.:

    Site 1 was listed at the top for SEO keyword "tree" but not for keyword "plant"

    Site 2 was listed at the top for SEO keyword "plant" but not for keyword "tree"

    Site 1 was in a folder "----.com/tree/"
    Site 2 was in a sub-domain "plant.----.com"

    Other than what I mentioned in the above posts, I haven't tried much more than that. So it appears (at the moment) that SEO keyword relevancy is a main factor to consider and not necessarily the dupe content (at least on a minimal basis like this example). I'll have to do some more testing with about 10 identical articles on the same basis. The outcome should be quite interesting.

    Has anyone tried this on a larger scale?

    Best regards
    Tigerwar
  • I've personally never had a problem with duplicate content affecting the ranking of any of my pages. I've also never had any of my sites "sandboxed".
    • [ 1 ] Thanks
    • [1] reply
    • Hi Steven,

      if you don't mind me asking, to what extent do you use dupe content on your site?


      thanks
      tigerwar
