The Death of Spintax: Spun Text Detection Algorithm

89 replies
I was at the pub with a fellow SEO who couldn't believe that Google can detect spun text. As a programmer, it's obvious they can. So this post is to settle the argument once and for all...

Although we can still get away with it for now, I am absolutely convinced there are very clever PhDs working on a spintax detection algorithm right now. I predict we have another year or two of getting away with spun content, before you'll have to replace it all with sentence-level super-spun content. The only reason we don't have it already is processing power, and I'm sure their brightest minds are already tuning the algorithm.

Pseudocode for a spintax detection algorithm
  1. Optional: Identify a short-list of candidates by scanning for posts with grammatical & writing style errors (like MS Word's grammar checker). You can skip this once you have enough processing power.
  2. Use LSI to group the posts into keyword topics posted around the same time.
  3. For each group:
    1. For each post:
      1. Use a tokenizing fuzzy matching algorithm to find any other posts in this group where >75% of the words are the same, in the same order (who spins every 4th word?)
      2. Optional: Confirm by checking whether the words that don't match are synonyms of each other.
  4. Mark all offending posts as belonging to the same blog network.
  5. Deindex the network.
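
As a rough illustration, steps 3.1.1 and 3.1.2 above could be sketched in Python as follows. This is only a sketch under stated assumptions: the posts are assumed to be already grouped by topic (the LSI step is not shown), the `SYNONYMS` table and all function names are made up for the example, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy matcher a search engine might really use.

```python
from difflib import SequenceMatcher
from itertools import combinations
import re

# Hypothetical, tiny synonym table; a real system would need a full thesaurus.
SYNONYMS = [
    {"cat", "feline", "kitty"},
    {"owned", "had", "possessed"},
    {"big", "large", "huge"},
]


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def same_order_ratio(a, b):
    """Share of the longer post's words that also appear, in order, in the other."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b), 1)


def are_synonyms(w1, w2):
    return any(w1 in group and w2 in group for group in SYNONYMS)


def mismatches_are_synonyms(a, b):
    """Optional check: every differing run must be a word-for-word synonym swap."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            return False
        if not all(are_synonyms(x, y) for x, y in zip(a[i1:i2], b[j1:j2])):
            return False
    return True


def find_spun_pairs(posts, threshold=0.75):
    """posts maps a post id to its text; all posts are assumed to share one topic group."""
    tokens = {pid: tokenize(text) for pid, text in posts.items()}
    flagged = []
    for p, q in combinations(sorted(tokens), 2):
        if same_order_ratio(tokens[p], tokens[q]) > threshold:
            flagged.append((p, q, mismatches_are_synonyms(tokens[p], tokens[q])))
    return flagged
```

Comparing every pair of posts is quadratic in the group size, which is why grouping by topic and time first matters so much for the processing cost.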

So auto-spinners beware! The end is nigh...

This is my first post here, so if you found it helpful, please rate/thank.
  • Profile picture of the author Mike Anthony
    Your premise is accurate as to the future ability of Google to determine garbage content in general but the pattern of spun content has nothing to do in most cases with identifying particular networks. Only the uber lazy network owner (but unfortunately there are lots of those) will spin the same content across the entire network. So in reality only 1 and 2 need be looked at.
    • Profile picture of the author harvest316
      Originally Posted by Mike Anthony View Post

      Your premise is accurate as to the future ability of Google to determine garbage content in general but the pattern of spun content has nothing to do in most cases with identifying particular networks. Only the uber lazy network owner will spin the same content across the entire network. So in reality only 1 and 2 need be looked at.
      I think you've got a point there, Mike.
  • Profile picture of the author dracoboar
    Originally Posted by harvest316 View Post

    I was at the pub with a fellow SEO who couldn't believe that Google can detect spun text. As a programmer, it's obvious they can. So this post is to settle the argument once and for all...

    Although we can still get away with it for now, I am absolutely convinced there are very clever PhDs working on a spintax detection algorithm right now. I predict we have another year or two of getting away with spun content, before you'll have to replace it all with sentence-level super-spun content. The only reason we don't have it already is processing power, and I'm sure their brightest minds are already tuning the algorithm.

    Pseudocode for a spintax detection algorithm
    1. Optional: Identify a short-list of candidates by scanning for posts with grammatical & writing style errors (like MS Word's grammar checker). You can skip this once you have enough processing power.
    2. Use LSI to group the posts into keyword topics posted around the same time.
    3. For each group:
      1. For each post:
        1. Use a tokenizing fuzzy matching algorithm to find any other posts in this group where >75% of the words are the same, in the same order (who spins every 4th word?)
        2. Optional: Confirm by checking whether the words that don't match are synonyms of each other.
    4. Mark all offending posts as belonging to the same blog network.
    5. Deindex the network.

    So auto-spinners beware! The end is nigh...

    This is my first post here, so if you found it helpful, please rate/thank.
    To my understanding, Google has been able to detect spun content for a long time.

    IIRC, back in the day one of the people selling spin software published a case study (this was when Google indented related or duplicate results), and the SERP he showed had his own pages indented.

    I'm sorry I don't remember more details, but I think they can already see spun content.
  • {{ DiscussionBoard.errors[6101895].message }}
    • Profile picture of the author nicktyler
      Originally Posted by Michael55555 View Post

      I hope spammers all fail.
      Just like so many things in life that are deemed unacceptable: if people want to do it and think it will get them somewhere, they will find a way to do it. What needs to happen is for spamming to become more work than it's worth compared with doing it the 'proper way'.

      I'm not sure spammers can ever fail completely, as it is just an arms race.

      spam > detect > new type spam > detect > New type spam > detect etc...
    • Profile picture of the author boxoun
      Originally Posted by Michael55555 View Post

      I hope spammers all fail.
      I hope you fail. There's no way in hell you can teach anyone how to become a doctor. btw stop spamming the forums with your pathetic anchor texts.
    • Profile picture of the author 4morereferrals
      Originally Posted by Michael55555 View Post

      I hope spammers all fail.
      I've always found it humorous that it's typically those most vociferously protesting "spammers" who have the spammiest, crap-filled MFA sites around.

      One man's trash is another man's treasure, I guess.
  • Profile picture of the author Bond Girl
    I think that if many of the blog sites can detect spun content that certainly big G can as well and I'm sure even better.
    • Profile picture of the author harvest316
      Originally Posted by Bond Girl View Post

      I think that if many of the blog sites can detect spun content that certainly big G can as well and I'm sure even better.
      Holly, you mention blog sites that can detect spun content. Can you tell me more?
  • Profile picture of the author madison_avenue
    Originally Posted by harvest316 View Post

    I was at the pub with a fellow SEO who couldn't believe that Google can detect spun text. As a programmer, it's obvious they can. So this post is to settle the argument once and for all...

    Although we can still get away with it for now, I am absolutely convinced there are very clever PhDs working on a spintax detection algorithm right now. I predict we have another year or two of getting away with spun content, before you'll have to replace it all with sentence-level super-spun content. The only reason we don't have it already is processing power, and I'm sure their brightest minds are already tuning the algorithm.

    Pseudocode for a spintax detection algorithm
    1. Optional: Identify a short-list of candidates by scanning for posts with grammatical & writing style errors (like MS Word's grammar checker). You can skip this once you have enough processing power.
    2. Use LSI to group the posts into keyword topics posted around the same time.
    3. For each group:
      1. For each post:
        1. Use a tokenizing fuzzy matching algorithm to find any other posts in this group where >75% of the words are the same, in the same order (who spins every 4th word?)
        2. Optional: Confirm by checking whether the words that don't match are synonyms of each other.
    4. Mark all offending posts as belonging to the same blog network.
    5. Deindex the network.

    So auto-spinners beware! The end is nigh...

    This is my first post here, so if you found it helpful, please rate/thank.
    Something like this will 'take out' a substantial amount of rewritten 'legitimate' content too. I see countless sites with rewritten Wikipedia content, etc.
  • Profile picture of the author plsearch
    Content that is spun at the word level is easy to detect, but when you spin using 5 or 6 alternative sentences and a paragraph structure that is also interchangeable, the article can look 100% unique many times over. Granted, this type of article can take hours and hours to make.

    I will still use high quality spins for layered linking.
    • Profile picture of the author harvest316
      I agree. Sentence+ level superspun text is totally the way to go. But it's gotta be written by a native English speaker.

      I'm now looking for a grammar checker that can take spintax as input, to guarantee that every path through the spintax at least makes basic grammatical sense.
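
One way such a tool might work is to enumerate every path through the spintax and run each resulting sentence through a checker. The sketch below is only an illustration: `expand_spintax`, `failing_paths` and `a_an_agrees` are made-up names, and the a/an rule is a toy stand-in for a real grammar checker, which is assumed rather than implemented here.

```python
import re

# Matches a {...} group that contains no nested braces, i.e. an innermost group.
INNERMOST = re.compile(r"\{([^{}]*)\}")


def expand_spintax(text):
    """Return every distinct string the (possibly nested) spintax can produce."""
    match = INNERMOST.search(text)
    if match is None:
        return [text]
    results = []
    for option in match.group(1).split("|"):
        expanded = text[:match.start()] + option + text[match.end():]
        for variant in expand_spintax(expanded):
            if variant not in results:
                results.append(variant)
    return results


def a_an_agrees(sentence):
    """Toy check: reject 'a' before a vowel letter and 'an' before a consonant letter."""
    words = sentence.lower().split()
    for article, following in zip(words, words[1:]):
        starts_with_vowel = following[0] in "aeiou"
        if article == "a" and starts_with_vowel:
            return False
        if article == "an" and not starts_with_vowel:
            return False
    return True


def failing_paths(spintax, check):
    """List every spun variant that the supplied check rejects."""
    return [variant for variant in expand_spintax(spintax) if not check(variant)]
```

For example, `failing_paths("She owned a {cat|{old|young} feline}.", a_an_agrees)` reports the single bad path "She owned a old feline." The number of paths grows multiplicatively with each group, so a long article may need sampling rather than full enumeration.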
  • Profile picture of the author R-Yeah
    The problem is also knowing what is authoritative: who says this is the original article and that is the spun one?
    Detection may be coming soon, but it will be for cases like putting 10 spun articles linking to your blog: that's easy to spot and understand.
    But what if a news site reports the same thing, written in a different manner? Who is the original?
    • Profile picture of the author Marketing Fool
      No, they won't ever be able to detect properly spun text.

      Here's two sentences:

      1. She owned a cat.
      2. Mary has always had a cat in her life.

      Do you really think Google will EVER be able to know that those are basically the same sentiment if I use them on different pages of my site?

      No.


      On the other hand, if you're stupid about spinning and you use these two sentences:

      1. She owned a cat.
      2. She owned a feline.

      Then sure, maybe someday Google will be able to look at that and say "SPUN!".

      So it will depend on how stupid you are in your spinning...if done correctly though, there is virtually no possible way for Google to detect spun content.
      • Profile picture of the author dburk
        Hi harvest316,

        I think you are going "down the rabbit hole" with your reasoning.

        Google isn't interested in detecting spun content, they are interested in filtering duplicate content.

        Nearly all content is spun content. It is either spun by the human brain, or spun by software. There is nothing wrong with spun content as long as the content is of high quality, and that it is useful, relevant and used in a responsible fashion.

        For the sake of diversity and purpose of conserving resources, Google will filter identical, or nearly identical results from SERPs and limit the amount of duplicate content that will be indexed. This is not a direction they have been heading in, they have been doing this since their inception. It's only your perception of this possibility that is new.
        • Profile picture of the author Marketing Fool
          Originally Posted by dburk View Post

          Hi harvest316,

          I think you are going "down the rabbit hole" with your reasoning.

          Google isn't interested in detecting spun content, they are interested in filtering duplicate content.

          Nearly all content is spun content. It is either spun by the human brain, or spun by software. There is nothing wrong with spun content as long as the content is of high quality, and that it is useful, relevant and used in a responsible fashion.

          For the sake of diversity and purpose of conserving resources, Google will filter identical, or nearly identical results from SERPs and limit the amount of duplicate content that will be indexed. This is not a direction they have been heading in, they have been doing this since their inception. It's only your perception of this possibility that is new.
          Since when has google filtered duplicate content? I can show you thousands of examples to the contrary...
          • Profile picture of the author dburk
            Originally Posted by Marketing Fool View Post

            Since when has google filtered duplicate content? I can show you thousands of examples to the contrary...
            Since their inception. Some of the methods and reasons are discussed in the original whitepaper (BackRub) published by Brin & Page, the founders of Google, prior to forming the commercial version of their search engine.

            Google has often publicly discussed their filtering methods:

            Official Google Webmaster Central Blog: Deftly dealing with duplicate content
            Official Google Webmaster Central Blog: Demystifying the "duplicate content penalty"
            Duplicate content - Webmaster Tools Help
            Cross-domain URL selection - Webmaster Tools Help
            Official Google Webmaster Central Blog: Raising awareness of cross-domain URL selections
            • Profile picture of the author scottmacair
              Most of the spun content out there is auto spun which I think would be quite easy for Google to detect.

              If you hand spin at sentence level then of course this is more difficult to detect but you could almost write the article from scratch in the time it takes you to spin sentences.

              I gave up spinning a while ago - do you really want to be one of those sad people that pumps out spam content because you can't produce interesting valuable content!?
            • Profile picture of the author Marketing Fool

              Uh-huh....*L* yeah, that's yet another instance of watching Google's mouth flap. If Google says it, it must be true, right? Except....anyone who's ever done even a tiny bit of SEO work knows that it's simply not true, Google doesn't kill duplicate content. Do we really need to start ANOTHER one of these arguments?
              • Profile picture of the author dburk
                Originally Posted by Marketing Fool View Post

                Uh-huh....*L* yeah, that's yet another instance of watching Google's mouth flap. If Google says it, it must be true, right? Except....anyone who's ever done even a tiny bit of SEO work knows that it's simply not true, Google doesn't kill duplicate content. Do we really need to start ANOTHER one of these arguments?
                Hi Marketing Fool,

                I do not believe anyone suggested Google will "kill duplicate content", only that they filter it for many search queries and that they limit the number of duplicate documents they will index.

                Just because you do not understand what Google is saying, or doing, does not make it untrue, or inaccurate. I have found their published documents to be quite accurate. They do not always disclose everything, especially sensitive information, but what they do disclose usually proves to be fairly accurate in my opinion. In fact, I think you would be hard pressed to find more than a few inaccuracies in everything they have ever published.
                • Profile picture of the author Marketing Fool
                  Originally Posted by dburk View Post

                  Hi Marketing Fool,

                  I do not believe any one suggested Google will "kill duplicate content", only that they filter it for many search queries and that they limit the number of duplicate documents they will index.

                  Just because you do not understand what Google is saying, or doing, does not make it untrue, or inaccurate. I have found their published documents to be quite accurate. They do not always disclose everything, especially sensitive information, but what they do disclose usually proves to be fairly accurate in my opinion. In fact, I think you would be hard pressed to find more than a few inaccuracies in everything they have ever published.
                  Hi Don,

                  Ok, you say "filter for search queries and limit the number", I say "kill"...do we need to argue semantics? We're both arguing about the same thing, whether or not duplicate content is bad.

                  I'm afraid we're going to have to agree to disagree. I've been doing this successfully for over 15 years. I've owned and exhaustively studied the serp movements of literally thousands and thousands of domain names. And I even developed one of the earlier popular search engine submission software tools back in the day which gave me access to all sorts of data.

                  Based on nothing but my own first-hand knowledge, I've determined that Google rarely levels with us these days (it wasn't always like that, but it is now), and they almost seem to go out of their way to misdirect us. Most of what they say is patently false, and if you follow their advice you will get tromped in the SERPs. That's just what my tests and experience show.

                  You can keep drinking their koolaid if you like, but my data shows me differently and I'll continue to follow that and not what some guru or Matt Cutts says.

                  Best of luck to you.
          • Profile picture of the author danparks
            Originally Posted by dburk View Post

            Hi harvest316,

            I think you are going "down the rabbit hole" with your reasoning.

            Google isn't interested in detecting spun content, they are interested in filtering duplicate content.

            Nearly all content is spun content. It is either spun by the human brain, or spun by software. There is nothing wrong with spun content as long as the content is of high quality, and that it is useful, relevant and used in a responsible fashion.

            For the sake of diversity and purpose of conserving resources, Google will filter identical, or nearly identical results from SERPs and limit the amount of duplicate content that will be indexed. This is not a direction they have been heading in, they have been doing this since their inception. It's only your perception of this possibility that is new.

            Originally Posted by Marketing Fool View Post

            Since when has google filtered duplicate content? I can show you thousands of examples to the contrary...
            Yup. Go to an article on a site that you think is spun. Copy the first sentence and paste it in Google. I've done that and seen dozens, hundreds of matches. Not even spun content, just absolute duplicate content. Google didn't filter it (I assume by "filter" you mean exclude duplicate content). Exactly how Google determines, or gives weight to, the "first" version may be a subject for debate, but I certainly don't think it's a real issue that Google "filters" duplicate content. I believe Google's dislike for duplicate content is more at the *site* level - they don't like the same content on multiple pages of one site. Which makes sense. I have a page ranking nicely for a keyword, so I make ten copies of the page and host them on the same site to try to get listed ten times in a Google search for that keyword.
            • Profile picture of the author nik0
              Originally Posted by danparks View Post

              Yup. Go to an article on a site that you think is spun. Copy the first sentence and paste it in Google. I've done that and seen dozens, hundreds of matches. Not even spun content, just absolute duplicate content. Google didn't filter it (I assume by "filter" you mean exclude duplicate content). Exactly how Google determines, or gives weight to, the "first" version may be a subject for debate, but I certainly don't think it's a real issue that Google "filters" duplicate content. I believe Google's dislike for duplicate content is more at the *site* level - they don't like the same content on multiple pages of one site. Which makes sense. I have a page ranking nicely for a keyword, so I make ten copies of the page and host them on the same site to try to get listed ten times in a Google search for that keyword.
              Google indexes duplicate content, no surprise there.

              Now try to get that copied article to rank: different story. Sure, you can, but it requires more effort.

              One step further: fill your website with 40 such copied articles and try to rank it. Not much of a chance.

              Don't agree? Go create a scraped Amazon affiliate site and let me know how it goes. I'm prepared to pay big money if you pull that off.

              Heck, even with scraped Amazon sites that are then spun, you won't stand a chance.
        • Profile picture of the author harvest316
          Originally Posted by dburk View Post

          Nearly all content is spun content. It is either spun by the human brain, or spun by software. There is nothing wrong with spun content as long as the content is of high quality, and that it is useful, relevant and used in a responsible fashion.
          Very interesting point about all content being spun. If you think hard enough about it, there's nothing new under the sun.

          I still believe that Google will soon begin to devalue poorly spun content, only because most autospun content is really only one step above spam (which I say while running several autoblogs myself!) and after they wipe out webspam, Google will naturally move up the chain to filter out autospun content.
      • Profile picture of the author Letsurf
        Originally Posted by Marketing Fool View Post

        No, they won't ever be able to detect properly spun text.

        Here's two sentences:

        1. She owned a cat.
        2. Mary has always had a cat in her life.

        Do you really think Google will EVER be able to know that those are basically the same sentiment if I use them on different pages of my site?

        No.


        On the other hand, if you're stupid about spinning and you use these two sentences:

        1. She owned a cat.
        2. She owned a feline.

        Then sure, maybe someday Google will be able to look at that and say "SPUN!".

        So it will depend on how stupid you are in your spinning...if done correctly though, there is virtually no possible way for Google to detect spun content.
        I agree that deep spinning at the paragraph, sentence, and phrase/word level is the correct method. But no matter the depth of spin, the first sentence would always be the most vulnerable, right? Even if you have several variations of the first sentence with spun words, the results will start to look VERY similar. So if you blast out hundreds of articles using the spun content and they all have a similar beginning, wouldn't that be easy to detect? Not to mention the title. How many titles do you usually use?
        • Profile picture of the author harvest316
          Originally Posted by Letsurf View Post

          I agree that deep spinning on paragraph, sentence, phrase/word level is the correct method. but... No matter the depth of spin the first sentence would always be the most vulnerable right? Even If you have several variations of the first sentence with spun words it will start to be VERY similar. So if you blast out 100's of articles using the spun content and they all have a similar beginning wouldn't that be easy to detect? Not to mention the title. How many titles do you usually use?
          From a programmer's perspective, I don't see the title and first paragraph being any more important than the rest of the body. So long as you have multiple levels of spin, you're probably okay. My main warning here is about single-level auto-spinning.
        • Profile picture of the author danparks
          Originally Posted by Letsurf View Post

          I agree that deep spinning on paragraph, sentence, phrase/word level is the correct method. but... No matter the depth of spin the first sentence would always be the most vulnerable right? Even If you have several variations of the first sentence with spun words it will start to be VERY similar. So if you blast out 100's of articles using the spun content and they all have a similar beginning wouldn't that be easy to detect? Not to mention the title. How many titles do you usually use?
          Well, if you believe this to be true (and I'm not saying it is or isn't true), then there's a very simple solution. Just make up a completely different sentence to use for the first sentence of each version of a spun article! How hard is that?
      • Profile picture of the author danparks
        Originally Posted by Marketing Fool View Post

        No, they won't ever be able to detect properly spun text.

        Here's two sentences:

        1. She owned a cat.
        2. Mary has always had a cat in her life.

        Do you really think Google will EVER be able to know that those are basically the same sentiment if I use them on different pages of my site?

        No.


        On the other hand, if you're stupid about spinning and you use these two sentences:

        1. She owned a cat.
        2. She owned a feline.

        Then sure, maybe someday Google will be able to look at that and say "SPUN!".

        So it will depend on how stupid you are in your spinning...if done correctly though, there is virtually no possible way for Google to detect spun content.
        Agree completely with this.
      • Profile picture of the author C Adept
        Originally Posted by Marketing Fool View Post

        No, they won't ever be able to detect properly spun text.

        Here's two sentences:

        1. She owned a cat.
        2. Mary has always had a cat in her life.

        Do you really think Google will EVER be able to know that those are basically the same sentiment if I use them on different pages of my site?

        No.


        On the other hand, if you're stupid about spinning and you use these two sentences:

        1. She owned a cat.
        2. She owned a feline.

        Then sure, maybe someday Google will be able to look at that and say "SPUN!".

        So it will depend on how stupid you are in your spinning...if done correctly though, there is virtually no possible way for Google to detect spun content.
        Reviving an old thread.

        Spintax is fairly useless for producing spun content that is meant to be seen as different by computers. For documents meant to be read by humans, grammatically correct spintax can still be useful for marketing copy, like descriptions that will be spread across the web.

        When read by software with a good LSI dictionary, it is trivial to reduce words with synonyms to canonical synonyms. Using your last example, a sentence with synonyms of "cat" can be constructed: "She saw the {cat|feline|kitten|kitty}." Spinning can produce four variations.

        She saw the cat.
        She saw the feline.
        She saw the kitten.
        She saw the kitty.

        Using "cat" as the canonical synonym, all four sentences can be reduced to "She saw the cat." The differences are an illusion. Spun documents can have a large number of different words and appear different at a cursory glance by people but are easily converted to the same document by software.
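
A minimal sketch of that reduction, assuming a hand-made `CANONICAL` table in place of a real synonym dictionary (the table and both function names are hypothetical, and `spin` only handles flat, non-nested spintax):

```python
import random
import re

# Hypothetical canonical-synonym table: each synonym maps to one chosen representative.
CANONICAL = {"feline": "cat", "kitten": "cat", "kitty": "cat"}


def spin(spintax, rng=random):
    """Pick one option at random from each {a|b|c} group (flat spintax only)."""
    return re.sub(
        r"\{([^{}]*)\}",
        lambda m: rng.choice(m.group(1).split("|")),
        spintax,
    )


def canonicalize(text):
    """Lower-case the text and replace each known synonym with its canonical form."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(CANONICAL.get(word, word) for word in words)
```

Whichever variant `spin("She saw the {cat|feline|kitten|kitty}.")` returns, `canonicalize` maps it to the same string, so the four spun sentences compare as identical.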

        To spin documents that will pass muster with both humans and computers, variations should be done at two levels, the word/phrase level and the structural level. Words and phrases can be varied with spintax. This can be seen as low level variation. High level variation can be done with sentence and paragraph variants, where one variant is chosen at random just like spintax chooses a word or phrase at random. For sentences this can be done with grammatical rearrangements like this:

        After looking for cars, the chicken walked across the road.
        The chicken walked across the road after looking for cars.

        By adding or taking away information that does not change the meaning of the sentence, more variants can be produced:

        The chicken walked across the road.
        Across the road the chicken walked.
        The chicken was wary of cars, so it cautiously walked across the road.

        So a simple sentence has five variants. All variants are grammatically correct and would pass human review. Adding spintax to the sentences would produce more variation but it would be mostly for the benefit of human readers.

        Paragraph variants offer the opportunity to dramatically alter the structure of a document. While spun sentences using the above technique have a one to one relationship with each other in spun documents, paragraph variants can be used to change the sentence relationships between spun documents. One paragraph variant could have three sentences while another could have five. The order in which ideas are introduced can often be rearranged without changing a paragraph's meaning. The result of paragraph variations will be spun documents with radically different document structures. They will appear this way not just to humans but also to machines.
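
The two levels can be sketched with plain nested lists. This is an illustration only, not the XML format or Java tools described in this post: the `DOC` layout, the function names, and the "{cautiously|carefully}" group are all assumptions made for the example.

```python
import random
import re


def spin_words(text, rng):
    """Low level: pick one option from each flat {a|b|c} spintax group."""
    return re.sub(
        r"\{([^{}]*)\}",
        lambda m: rng.choice(m.group(1).split("|")),
        text,
    )


def spin_document(doc, seed=None):
    """High level: choose a paragraph variant, then a sentence variant per slot."""
    rng = random.Random(seed)
    paragraphs = []
    for paragraph_variants in doc:
        sentence_slots = rng.choice(paragraph_variants)
        sentences = [spin_words(rng.choice(slot), rng) for slot in sentence_slots]
        paragraphs.append(" ".join(sentences))
    return "\n\n".join(paragraphs)


# One paragraph slot with two paragraph variants of different lengths.
DOC = [
    [
        [  # paragraph variant A: a single sentence slot with two rearrangements
            [
                "After looking for cars, the chicken walked across the road.",
                "The chicken walked across the road after looking for cars.",
            ],
        ],
        [  # paragraph variant B: two sentence slots
            ["The chicken was wary of cars."],
            [
                "It {cautiously|carefully} walked across the road.",
                "Across the road the chicken walked.",
            ],
        ],
    ],
]
```

Calling `spin_document(DOC, seed=1)` twice gives the same text, while different seeds can give outputs that differ in sentence count as well as wording, which is the structural difference described above.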

        Using these concepts I wrote software to do this type of spinning. I started out by creating an XML document specification to store documents with spintax, sentence variants, and paragraph variants. Then I wrote Java tools to spin the documents. Finally on top of the original Java code I built a GUI editor called whirlDOC. I guess I need to figure out how to do a WSO.
  • Profile picture of the author jinx1221
    I believe that they would only look for spun content "matches" against content that is already indexed, not necessarily everything that is out there; anything more would take far more processing than their goal really needs. They want their index to be as close to unique content as possible, of course. But say you posted 100 articles, all spun content: not all 100 are going to be indexed, and that is the way it is today. Our goal is really the 'juice', not the indexing, anyway. Sure, you could try to narrow things down to a network, but really, they don't want to deindex 'your' network, they want to deindex the whole network, kind of like how cops don't want to bust the users, they want to bust the pushers. Why doesn't anybody think that they could detect spun content before now? Everybody has been saying to spin at x% or above uniqueness, at both sentence level and word level, etc. Well, that's always been why.
    Signature

    The Ultimate Private Network Management,
    Visualization and Automation Tool




  • Profile picture of the author successproducts
    There is an alternative- and it's darn cheap. It's seriously cheap and you can reuse all of your PLR at the same time.
  • Profile picture of the author Nicky Papers
    If you're using WordAi you'll never have to worry about poor quality spintax again. It's the most advanced natural language processing product on the market.

    It's still in private beta right now so it's not available to the general public. Hit up Cardine by sending him a PM and he'll set you up. He developed the tool and is an expert in this field.

    As a WordAi user, I can say it freaking rocks! I've been using it for the past few months and it's been a game changer for me. Love it! : )
  • Profile picture of the author IsaacWendt
    Unless you're writing an original post from a study or research, everything else is really just spun content. Even if you read an article for research and rewrite it by hand with your own character, thoughts and ideas, you could still end up with a very similar article.

    I guess Google will just use social and other signals to see if the content is quality and then give the spun content penalty, or whatever they are going to do.

    But the longer I do SEO, the more I try to do white hat, as it really is the only long-term strategy. But black/gray hat is fun while it lasts, and of course easy! =)
    Signature

    We Provide SEO and Web Design for Small and Local Business.

  • Profile picture of the author harvest316
    Thanks Nicky. I've heard about a similar product (very much in alpha) that generates articles from scratch, doing the research and all, and creates totally readable articles. This is definitely the way forward. Being new here, I can't PM Cardine yet, but I'm definitely interested.

    Die, you auto-spinning tools, die!
    • Profile picture of the author cardine
      Originally Posted by harvest316 View Post

      Thanks Nicky. I've heard about a similar product (very much in alpha) that generates articles from scratch, doing the research and all, and creates totally readable articles. This is definitely the way forward. Being new here, I can't PM Cardine yet, but I'm definitely interested.

      Die, you auto-spinning tools, die!
      Here is some example spintax that WordAi is automatically capable of:
      {If { { you buy|you purchase|you get} my {product|solution}|my {product|solution} is {bought|ordered|obtained} by you}, {I will|I'll} be {very|really} happy|{I will|I'll} be {very|really} happy if { { you buy|you purchase|you get} my {product|solution}|my {product|solution} is {bought|ordered|obtained} by you } }.
      That type of spinning is not really detectable at all by Google, and is readable enough to pass a manual review. It obviously isn't able to do nested spintax for every sentence, but I am working on improving that every day (and WordAi currently passes Copyscape very easily on 99%+ of the spins it does).
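      For readers who have not seen how nested spintax like the example above gets resolved into a sentence, here is a minimal sketch in Python. It is not WordAi's code; the function name `spin` and the whitespace clean-up are assumptions for illustration. It repeatedly replaces the innermost `{a|b|c}` group with one randomly chosen alternative until no groups remain.

```python
import random
import re

# Matches a {...} group that contains no further braces, i.e. an innermost group.
INNERMOST = re.compile(r"\{([^{}]*)\}")

SAMPLE = (
    "{If { { you buy|you purchase|you get} my {product|solution}"
    "|my {product|solution} is {bought|ordered|obtained} by you}, "
    "{I will|I'll} be {very|really} happy"
    "|{I will|I'll} be {very|really} happy if "
    "{ { you buy|you purchase|you get} my {product|solution}"
    "|my {product|solution} is {bought|ordered|obtained} by you } }."
)


def spin(text, rng=None):
    """Resolve nested spintax from the inside out, picking one option per group."""
    rng = rng or random.Random()
    while True:
        text, replaced = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if replaced == 0:
            break
    text = re.sub(r"\s+", " ", text).strip()      # collapse stray spaces
    return re.sub(r"\s+([.,!?;:])", r"\1", text)  # no space before punctuation


print(spin(SAMPLE))
```

      Each call prints one variant of the sentence; passing a seeded `random.Random` makes the choice repeatable.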

      My eventual goal (4-5 months away) is to be able to generate completely unique content from scratch. I haven't quite achieved that yet, but I'm getting closer, and creating a spinner that can generate 100% readable text is the first step for me. I'll make very sure you know all about that when the time comes.

      The signup link is here (it's still in beta, but it's getting closer to launch so I don't mind sharing the link publicly).

      Also if you still can't PM me but still have any other questions, feel free to skype me (cardine18) or email me (alex@wordai.com).



      Originally Posted by Letsurf View Post

      I agree that deep spinning on paragraph, sentence, phrase/word level is the correct method. but... No matter the depth of spin the first sentence would always be the most vulnerable right? Even If you have several variations of the first sentence with spun words it will start to be VERY similar. So if you blast out 100's of articles using the spun content and they all have a similar beginning wouldn't that be easy to detect? Not to mention the title. How many titles do you usually use?
      I would not blast out the same article spun 100's of times unless you have very very high quality spintax. There is always a tradeoff between readability and uniqueness, and even with the spintax example I gave above, you can probably use that no more than ~10 times before similarities start to leak out. It is far better to take 100 articles, spin them and use each spintax variation 2-3 times than it is to take one article, spin it and use that spintax 200 times.
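      To put a rough number on that tradeoff, the nominal variant count of a piece of spintax can be worked out by multiplying along a sequence and adding across alternatives. The sketch below is a minimal Python illustration; the name `count_variants` is made up, and the result is only an upper bound on distinct outputs. It says nothing about how similar those outputs look to each other, which is the real limit on reuse.

```python
SAMPLE = (
    "{If { { you buy|you purchase|you get} my {product|solution}"
    "|my {product|solution} is {bought|ordered|obtained} by you}, "
    "{I will|I'll} be {very|really} happy"
    "|{I will|I'll} be {very|really} happy if "
    "{ { you buy|you purchase|you get} my {product|solution}"
    "|my {product|solution} is {bought|ordered|obtained} by you } }."
)


def count_variants(spintax):
    """Upper bound on distinct spins of a spintax string."""
    pos = 0

    def parse(top_level):
        nonlocal pos
        total = 0    # sum over the alternatives finished so far
        current = 1  # product of group sizes inside the current alternative
        while pos < len(spintax):
            ch = spintax[pos]
            pos += 1
            if ch == "{":
                current *= parse(False)
            elif ch == "}" and not top_level:
                return total + current
            elif ch == "|" and not top_level:
                total += current
                current = 1
        return total + current

    return parse(True)


print(count_variants(SAMPLE))  # 96 nominal variants for the example sentence
```

      The example sentence has 96 nominal variants, yet every one of them is built from the same handful of phrases, which is why reusing it more than a few times starts to show.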
      • Profile picture of the author nik0
        Banned
        Originally Posted by cardine View Post

        I would not blast out the same article spun 100's of times unless you have very very high quality spintax. There is always a tradeoff between readability and uniqueness, and even with the spintax example I gave above, you can probably use that no more than ~10 times before similarities start to leak out. It is far better to take 100 articles, spin them and use each spintax variation 2-3 times than it is to take one article, spin it and use that spintax 200 times.
        Exactly, using this more than 10 times already leaves enough traces for Google to detect once they have the processing power.

        If you look at the effort put in to make it all readable it would be easier to just write those 10 sentences, especially when you type 100 words/minute like me and many other computer nerds

        I think producing this tiny spin would take me at least 5-10 minutes and then you have the occasional error with misplacing a spin bracket. Not worth it at all.

        In case you have a tool that is able to handle this flawlessly then it might be worth it.

        All the stories about creating huge spins based on 100+ unique articles: who is that really useful for? Most people already agree that they would rather have truly unique, valuable content on their websites.

        Everyone also seems to agree that mass link spam isn't working anymore so blasting 10.000 variations with automatic software building services is also out of the question.

        There are only few people that I can think of who this might be useful for:

        - massive churn & burners
        - a very specific type of marketeer that I have in mind

        Anyway kudos for you as you already said to take 100 unique articles and use a spun of each max 2-3 times which cuts the costs in half. Finally someone I can talk with
  • Profile picture of the author rkseid
    Article + Spintax + Lazy Author = detectable
    Article + Spintax + Fast Author = some detectable
    Article + Spintax + Real Author = few detectable

    Lazy Author uses spintax as the author.
    Fast Author is spintax's assistant.
    Real Author uses spintax as just a low-end assistant.
    • Profile picture of the author kaytav
      Originally Posted by rkseid View Post

      Article + Spintax + Lazy Author = detectable
      Article + Spintax + Fast Author = some detectable
      Article + Spintax + Real Author = few detectable

      Lazy Author uses spintax as the author.
      Fast Author is spintax's assistant.
      Real Author uses spintax as just a low-end assistant.
      What is that supposed to mean?
  • Profile picture of the author harvest316
    Wow, this thread really took off. Thanks guys!
  • Profile picture of the author Velant
    Originally Posted by harvest316 View Post

    Pseudocode for a spintax detection algorithm
    1. Optional: Identify a short-list of candidates by scanning for posts with grammatical & writing style errors (like MS Word's grammar checker). You can skip this once you have enough processing power.
    2. Use LSI to group the posts into keyword topics posted around the same time.
    3. For each group:
      1. For each post:
        1. Use a tokenizing fuzzy matching algorithm to find any other posts in this group where >75% of the words are the same, in the same order (who spins every 4th word?)
        2. Optional: Confirm by checking whether the words that don't match are synonyms of each other.
    4. Mark all offending posts as belonging to the same blog network.
    5. Deindex the network.
    Smartly spun content (i.e. spun at all levels: word, phrase, sentence and paragraph) will never be detected, simply because it's indistinguishable from a manual rewrite. And who in their right mind would even dare to think of punishing rewriters???

    The algorithm you describe is only applicable to the most primitive 1D spinning at word level, and even in that case it's highly error-prone.
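    For reference, step 3.1 of the quoted pseudocode (find posts where more than 75% of the words match, in the same order) can be sketched in a few lines of Python using the standard library's `difflib.SequenceMatcher`. This is only an illustration of that one step under simple assumptions, not anything Google is known to run; the function names, the tokenizer and the sample posts are made up.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shared_word_ratio(a, b):
    """Share of words that line up in order between two posts (0.0 to 1.0)."""
    matcher = SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False)
    return matcher.ratio()


def find_suspect_pairs(posts, threshold=0.75):
    """Return index pairs of posts whose in-order word overlap exceeds threshold."""
    suspects = []
    for (i, a), (j, b) in combinations(enumerate(posts), 2):
        if shared_word_ratio(a, b) > threshold:
            suspects.append((i, j))
    return suspects


posts = [
    "If you buy my product I will be very happy today",
    "If you purchase my product I will be really happy today",
    "Google updates its ranking algorithm several times every year",
]
print(find_suspect_pairs(posts))  # [(0, 1)]
```

    Word-level synonym swaps leave the ratio high, so the first two posts are flagged. Reordering sentences or swapping whole paragraph variants lowers the ratio, which is the weakness pointed out in the reply above.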

    • Profile picture of the author nik0
      Banned
      Originally Posted by Velant View Post

      Smartly spun content (i.e. spun at all levels: word, phrase, sentence and paragraph) will never be detected, simply because it's indistinguishable from a manual rewrite. And who in their right mind would even dare to think of punishing rewriters???

      The algorithm you describe is only applicable to the most primitive 1D spinning at word level, and even in that case it's highly error-prone.

      Depending on your 800% uniqueness blablabla that those spin tools indicate.

      I know someone who spins articles at paragraph/sentence and phrase/word level. I only had to compare 20 articles to find out that it's spun, despite him using paragraph rotation.

      He spun each paragraph 3 times, each sentence 3 times and phrases/words throughout it all. Let's say you do 6 paragraph variants, 6 sentence variants, etc.; it would probably take me 50 or 100 articles to figure it out. A computer algorithm would be way more effective than me, so quit living in your fantasy world.
  • Profile picture of the author DPM70
    and who in their right mind would even dare to think of punishing rewriters???
    GOOGLE?

    Of course, it doesn't matter when you are writing beautifully for your audience.

    Any updates from inside the big goog on what spun crap can be identified? This is a year after the OP put down his tracks. Having never used any of it, I'm feeling hunky-dory.
    Signature
    I don't build in order to have clients. I have clients in order to build. - Ayn Rand
    • Profile picture of the author paulgl
      I spin content all the time. You have to. Why reinvent the wheel?
      It's how you spin it, I presume. I cannot fathom anyone actually
      using any auto-spin garbage and expecting to get anywhere.

      Yup, year old thread, dug up. Notice nothing really changed in
      the past year?

      People come up with ideas, those ideas go away. Then they
      come back with new ideas, or try and prop old ones up.

      If you think google cares about good, spun content, and penalizing
      it, take a look at wikipedia....the king of spun, copied, and sometimes
      garbage, content.

      It's not spun, copied, or garbage content. It's how you present it,
      among other things.

      We'll see if anyone below me actually reads the whole thread, and realizes
      how old it is without blindly posting some garbage.

      ROTFLMAO! Garbage content? Has that hurt the WF? Not one bit!

      Paul
      Signature

      If you were disappointed in your results today, lower your standards tomorrow.

      • Profile picture of the author bluecoyotemedia
        Paul

        you're right!!!!

        I mean I truly don't believe its necessary to ever actually sit down and write new content when it can be..hmmm.. re-purposed.

        there is a balance between university grade content and normal this is how I speak content.

        so I don't see at what point Google can actually penalize you unless it's blatant gibberish



        look at wordai.. they spin content probably the same level as I would write

        eddie



        Originally Posted by paulgl View Post

        I spin content all the time. You have to. Why reinvent the wheel?
        It's how you spin it, I presume. I cannot fathom anyone actually
        using any auto-spin garbage and expecting to get anywhere.

        Yup, year old thread, dug up. Notice nothing really changed in
        the past year?

        People come up with ideas, those ideas go away. Then they
        come back with new ideas, or try and prop old ones up.

        If you think google cares about good, spun content, and penalizing
        it, take a look at wikipedia....the king of spun, copied, and sometimes
        garbage, content.

        It's not spun, copied, or garbage content. It's how you present it,
        among other things.

        We'll see if anyone below me actually reads the whole thread, and realizes
        how old it is without blindly posting some garbage.

        ROTFLMAO! Garbage content? Has that hurt the WF? Not one bit!

        Paul
        Signature

        Skunkworks: noun. informal.

        A clandestine group operating without any external intervention or oversight. Such groups achieve significant breakthroughs rarely discussed in public because they operate "outside the box".
        https://short-stuff.com/-Mjk0fDExOA==

  • Profile picture of the author Campbell24
    Your money sites should always have unique content anyway.

    I only use spun content for linking with SENuke and have never been penalized.
    Signature
    FREE SEO CONSULTATION/ADVICE (from a 7-figure earner)

    I will answer your SEO questions 100% for free.

    Just ask me whatever you want!
    • Profile picture of the author marievvv
      I think that if you create spun content for your Tier 1 network, you expose your money site to a penalty.

      Perhaps the penalty will not occur now because Google has not implemented the algorithm yet, but Google always updates its algorithm and one day some websites will be hit.

      If you create your spun articles now, what do you know about the spun article detection algorithm that will arrive in 5 years' time?

      Do you remember where the internet was 5 years ago? see the evolution...

      So, the strategy you follow now will have consequences for the future.
    • Profile picture of the author jackrice
      that is exactly what i do
      Originally Posted by Campbell24 View Post

      Your money sites should always have unique content anyway.

      I only use spun content for linking with SENuke and have never been penalized.
  • Profile picture of the author johnbrown12
    Original content is the key to blogging.
    Signature
    How to make money from $3000 to $7000 per month:

    http://forms.aweber.com/form/85/530278285.htm

    *JOIN 900+ Warriors*
  • Profile picture of the author nik0
    Banned
    @OP: I totally agree, and the way you outlined it makes it look dead easy to detect spun content, so why wouldn't Google take action? You provided the answer already: processing power.

    @Others who say that Google has no reason to go after spun content, OF COURSE it has a good reason to go after spun content which is that it provides zero value to the internet so why would Google keep those pages indexed? It won't.

    Besides, spun content is used for artificial link building, if Google wasn't against that they would've never bothered with updates like Penguin.

    As far as I know, I'm the only one on this whole forum (of those that advertise in the for-sale sections here, I have to add) that quit using spun content over a year ago already, and everyone thought I was stupid to waste money on expensive content for link building purposes. After all, my service would be less effective because I would be able to build fewer links. Well, so be it; many take that for granted. I can't wait till the day that spun content gets targeted, so much new business would come my way!

    It's even so bad that when clients ask me to recommend someone else on this forum (because they are in a tough niche, for example, and need extra backlink power at a low price), I can't recommend anyone, as every single provider is doing something I don't like. Even the most popular ones (no wonder they are so popular when they can deliver 50 or 100 links for the same price where they only get a dozen links from me).
  • Profile picture of the author cbpayne
    Microsoft has a grammar detector in Word.
    ...why is it so difficult to believe that Google uses something similar to detect bad grammar --> easily detect crap spun content?
  • Profile picture of the author keith88
    Originally Posted by harvest316 View Post

    I was at the pub with a fellow SEO who couldn't believe that Google can detect spun text. As a programmer, it's obvious they can. So this post is to settle the argument once and for all...

    Although we can still get away with it for now, I am absolutely convinced there are very clever PhDs working on a spintax detection algorithm right now. I predict we have another year or two of getting away with spun content, before you'll have to replace it all with sentence-level super-spun content. The only reason we don't have it already is processing power, and I'm sure their brightest minds are already tuning the algorithm.

    Pseudocode for a spintax detection algorithm
    1. Optional: Identify a short-list of candidates by scanning for posts with grammatical & writing style errors (like MS Word's grammar checker). You can skip this once you have enough processing power.
    2. Use LSI to group the posts into keyword topics posted around the same time.
    3. For each group:
      1. For each post:
        1. Use a tokenizing fuzzy matching algorithm to find any other posts in this group where >75% of the words are the same, in the same order (who spins every 4th word?)
        2. Optional: Confirm by checking whether the words that don't match are synonyms of each other.
    4. Mark all offending posts as belonging to the same blog network.
    5. Deindex the network.

    So auto-spinners beware! The end is nigh...

    This is my first post here, so if you found it helpful, please rate/thank.
    You know what's funny though???

    No matter how many changes or different things Google has thrown at us, we always seem to have an answer for it.

    Those people working at Google are very sharp. There are some very sharp individuals in IM as well.

    Let's see how many products are released to get around this lol.

    Salute to the IM genii.
    • Profile picture of the author PerformanceMan
      Originally Posted by keith88 View Post

      You know what's funny though???

      Those people working at Google are very sharp. There are some very sharp individuals in IM as well..
      Yeah, THAT is FUNNY
      Signature
      Free Special Report on Mindset - Level Up with Positive Thinking
  • Profile picture of the author ddev
    The good news: Google still doesn't detect spintax in videos (it doesn't
    understand the content of videos).

    Video Marketing = Still Great To Get Google Ranks.
  • Profile picture of the author jinx1221
    We could debate on and on whether they can detect spun content or not.. the fact remains, there are plenty of obviously spun pages still ranking in Google. If they programmed part of the algo to deindex spun pages, as of yet it's barely even touched the majority. As far as risk goes, my opinion is, if you're gonna do it, do it well or not at all.
    Signature

    The Ultimate Private Network Management,
    Visualization and Automation Tool




  • Profile picture of the author FranksToys
    It's all about money. Google is a business.

    There are better approaches for identifying networks, and content analysis on that level isn't one of those ways.

    Google only needs to get the algorithm just right, beyond that there's no need to do massive changes. As long as Adwords is the prominent result, the rest only matters as long as people continue to use Google.
    Signature
    "The highest glory of the American Revolution was this - that it connected, in one indissoluble bond, the principles of civil government with the principles of Christianity."
  • Profile picture of the author attorneydavid
    They can write algorithms that produce symphonies that people think are masterpieces until they find out a computer wrote them. It's only a matter of time before there's something that can produce passable articles.
    Signature

    I've lost 90 pounds(160+ overall) fasting since January 2016 after failing for years on diets that just made me sick and miserable. Check out Prudently.com where I'm writing about fasting and weight loss. Get a Brandable Domain Name at Name Perfection.

  • Profile picture of the author Kevin Maguire
    I think some are also missing this point.

    Just as detection of spun content has moved forward, so has automatic creation of content. I have had the privilege of seeing what's coming down the line in terms of assisted content creation, and it's mind-blowing shit that makes spinning almost laughable.

    So don't sweat just yet.
  • Profile picture of the author harvest316
    Hey Kevin, as a true lover of English, I wouldn't mind seeing this "mind blowing" content creation stuff for myself.

    I'm yet to find any auto-generated content that is truly readable!
  • Profile picture of the author TomerN
    Spintax is not dead, it's just a lot more difficult. There are tools out there that can spin well. Have you heard of WordAi? It works pretty good for me. Not an affiliate at all, just someone who has used it.
  • Profile picture of the author RickCopy
    content spinners are thieves... plain and simple. You flood the internet with stolen material that provides no value to your visitors... and at best the content is barely even readable...all just to steal scraps from real content producers.
    • Profile picture of the author attorneydavid
      Originally Posted by RickCopy View Post

      content spinners are thieves... plain and simple. You flood the internet with stolen material that provides no value to your visitors... and at best the content is barely even readable...all just to steal scraps from real content producers.
      I'm not sure you understand what spinning is. You're thinking content scrapers. Though sometimes the two techniques are combined.
      Signature

      I've lost 90 pounds(160+ overall) fasting since January 2016 after failing for years on diets that just made me sick and miserable. Check out Prudently.com where I'm writing about fasting and weight loss. Get a Brandable Domain Name at Name Perfection.

      • Profile picture of the author RickCopy
        Originally Posted by attorneydavid View Post

        I'm not sure you understand what spinning is. You're thinking content scrapers. Though sometimes the two techniques are combined.
        No I know exactly what it is... its taking someone's original article, putting it into a program and changing it around just enough so that it passes automated plagiarism detection. I also know that by the time the end product is edited enough to even make sense you could have just written the article yourself.

        Its lazy, it's unsustainable as a business model and it genuinely takes away from the credit the original author should receive for actually writing the content in the first place....in a word...its theft.
        • Profile picture of the author nik0
          Banned
          Originally Posted by RickCopy View Post

          No I know exactly what it is... its taking someone's original article, putting it into a program and changing it around just enough so that it passes automated plagiarism detection. I also know that by the time the end product is edited enough to even make sense you could have just written the article yourself.

          Its lazy, it's unsustainable as a business model and it genuinely takes away from the credit the original author should receive for actually writing the content in the first place....in a word...its theft.
          There are plenty of people who first write a couple of articles and then spin it on a massive scale so in that case it's not theft.

          I don't do it anymore but previously I did it in such way for tiered link building.
          • Profile picture of the author RickCopy
            Originally Posted by nik0 View Post

            There are plenty of people who first write a couple of articles and then spin it on a massive scale so in that case it's not theft.

            I don't do it anymore but previously I did it in such way for tiered link building.
            If this is the type of stuff you guys do then more power to you... not my cup of tea. Im a writer...I have pride in what I provide my readers. People working the system to get ahead without actually having to provide any value to the end user bug me...a lot....sorry if I offended anyone.
            • Profile picture of the author nik0
              Banned
              Originally Posted by RickCopy View Post

              If this is the type of stuff you guys do then more power to you... not my cup of tea. Im a writer...I have pride in what I provide my readers. People working the system to get ahead without actually having to provide any value to the end user bug me...a lot....sorry if I offended anyone.
              Maybe you misunderstood it but most people are talking about link building content that hardly anyone ever will read.
              • Profile picture of the author RickCopy
                Originally Posted by nik0 View Post

                Maybe you misunderstood it but most people are talking about link building content that hardly anyone ever will read.
                lol wait a sec I thought you hated spun content?
                • Profile picture of the author nik0
                  Banned
                  Originally Posted by RickCopy View Post

                  lol wait a sec I thought you hated spun content?
                  I don't hate it, my clients hate it so I don't use it anymore.

                  And now with the recent Penguin 2.1 it seems Google figured out a way to detect it, so I'm glad I took my clients' advice to heart about a year ago.
  • Profile picture of the author Make Money Ninja
    Here are my thoughts.

    1) I have been using Wordai Turing mode to produce largely readable highly spun articles.
    2) I have been using them on mini sites, i get the content indexed and ranking fine with a few manual edits.
    3) This allows me to churn out 30 or 50 articles a day if i wish, which is a lot of content and obviously has value.


    Now a few more observations:

    1) Spun (unedited) content on web 2.0 sites is getting harder to index. For whatever reason, the requirements for getting the content indexed are now way higher.

    2) Several sites where i predominantly received links from spun content (relevant web 2.0 sites, readable spun content) were penalized with the last penguin update.

    Now going forward I cannot, with a clear conscience, recommend that newbies create spun content for the purpose of link building. In fact, tools like SEnuke may now actually be very counterproductive.

    In the long term, unless you are editing the content to make sure it's unique and readable, and don't put 1000 different variations of the spintax up, you are ****ed. Mass spinning stuff and posting it on thousands of properties is not a productive tactic anymore and is most definitely on the way out.

    Now i am not saying people cant rank or wont rank over the medium term using such tactics. Some will be able to. But long term, this tactic is doomed, so be warned and start changing your best practices now.
    Signature

    The Ultimate Guide To Link Building

    Get More Links - Generate More Traffic - Make More Money!
    • Profile picture of the author cardine
      Originally Posted by Make Money Ninja View Post

      Now i am not saying people cant rank or wont rank over the medium term using such tactics. Some will be able to. But long term, this tactic is doomed, so be warned and start changing your best practices now.
      As Google gets better at detecting spun content, spinners will get better at spinning. I know that from what I have seen in the past, as well as what I know is being developed for the future. It won't be long before there will be no differences at all in quality between spun versus not spun content.
      • Profile picture of the author RickCopy
        Originally Posted by cardine View Post

        As Google gets better at detecting spun content, spinners will get better at spinning. I know that from what I have seen in the past, as well as what I know is being developed for the future. It won't be long before there will be no differences at all in quality between spun versus not spun content.
        HAHAHAHA

        maybe if you guys dont re-evaluate what you consider "quality".

        Some of the stuff Ive seen people post or link to on here would make my old English teachers have a heart attack.
        • Profile picture of the author cardine
          Originally Posted by RickCopy View Post

          HAHAHAHA

          maybe if you guys dont re-evaluate what you consider "quality".

          Some of the stuff Ive seen people post or link to on here would make my old English teachers have a heart attack.
          Did you know that Forbes has articles that are written 100% by machines now?

          I'm not talking about your average content tool that is being sold for a one time payment of $77 that has a sexy sales pitch but no real artificial intelligence powering it. There are real companies putting real money into R&D for artificial intelligence.

          Google is still very far away from properly detecting whether content is spun (you can just read their most recent patents and see how crude their current techniques are). By the time Google gets any good at detecting whether content is high quality or not, machines will be writing nearly all the high quality content on the web anyways.
          • Profile picture of the author nik0
            Banned
            Originally Posted by cardine View Post

            Did you know that Forbes has articles that are written 100% by machines now?
            Source or it didn't happen.
          • Profile picture of the author RickCopy
            Originally Posted by cardine View Post

            Did you know that Forbes has articles that are written 100% by machines now?

            I'm not talking about your average content tool that is being sold for a one time payment of $77 that has a sexy sales pitch but no real artificial intelligence powering it. There are real companies putting real money into R&D for artificial intelligence.

            Google is still very far away from properly detecting whether content is spun (you can just read their most recent patents and see how crude their current techniques are). By the time Google gets any good at detecting whether content is high quality or not, machines will be writing nearly all the high quality content on the web anyways.
            This is the software you're talking about, and it's a long way from doing what you think it will...

            30 Clients Using Computer-Generated Stories Instead of Writers - GalleyCat

            and can I get a turn at your crystal ball when you're done with it? lol
            • Profile picture of the author cardine
              Originally Posted by RickCopy View Post

              This is the software you're talking about, and it's a long way from doing what you think it will...

              30 Clients Using Computer-Generated Stories Instead of Writers - GalleyCat

              and can I get a turn at your crystal ball when you're done with it? lol
              I never said that the technology is completely there right now (although it is already good enough for Forbes, and the article you linked to was from almost 2 years ago). I said that by the time Google gets good enough to detect whether content is automatically written, computers will be able to automatically write content that good.

              Which kind of makes sense when you think about it. The same technology needed to write content that makes sense is required to tell if existing content makes sense.

              I'm an AI developer. It is not a crystal ball, it is the technology that I see being developed around me every day.

              Originally Posted by nik0 View Post

              Source or it didn't happen.
              RickCopy linked to a source above.
              • Profile picture of the author RickCopy
                Originally Posted by cardine View Post

                I'm an AI developer. It is not a crystal ball, it is the technology that I see being developed around me every day.
                I don't question the technology... I'm sure computers will be able to rewrite crappy content better and better as the years go on. I question your assumption that Google won't be able to catch it. Last I heard, they aren't exactly forthcoming about how their algorithm works or how they detect duplicate content. All we can do is assume, and I'm assuming that a multi-billion-dollar giant like Google isn't going to let robot writers that crank out soulless content dominate their search engine.

                I think the main area of contention here is that you and I have a very different idea of what constitutes good content.
                • Profile picture of the author cardine
                  Originally Posted by RickCopy View Post

                  I don't question the technology... I'm sure computers will be able to rewrite crappy content better and better as the years go on. I question your assumption that Google won't be able to catch it. Last I heard, they aren't exactly forthcoming about how their algorithm works or how they detect duplicate content. All we can do is assume, and I'm assuming that a multi-billion-dollar giant like Google isn't going to let robot writers that crank out soulless content dominate their search engine.

                  I think the main area of contention here is that you and I have a very different idea of what constitutes good content.
                  Google gives people ample opportunity to figure out how they judge content. If they were able to tell whether content is well written or not, they would also be able to generate their own well written content (they could simply randomly pick sentences until they come up with one that their algorithm says is "well written"). They clearly do not have the technology to write good quality content themselves, so you can then infer that they do not have the technology to judge good quality content. Or you can just look at their patents and see their techniques, which are not that impressive.
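
                  To make the "randomly pick sentences until the algorithm accepts one" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the tiny lexicon, the fixed part-of-speech pattern, and the names toy_judge and generate_by_rejection are hypothetical stand-ins, not anything Google is known to use.

```python
import random

# Hypothetical toy lexicon: word -> part of speech. A real quality
# judge would be vastly more complex; this is only a stand-in.
LEXICON = {
    "the": "det", "a": "det",
    "robot": "noun", "writer": "noun", "article": "noun",
    "writes": "verb", "reads": "verb", "ranks": "verb",
}

# The toy judge calls a sentence "well written" only if it follows this pattern.
PATTERN = ["det", "noun", "verb", "det", "noun"]


def toy_judge(words):
    """Return True if the word list matches the accepted pattern."""
    return [LEXICON.get(w) for w in words] == PATTERN


def generate_by_rejection(judge, rng, length=5, max_tries=100_000):
    """Draw random word sequences until the judge accepts one.

    Returns (sentence, tries) on success or (None, max_tries) on failure.
    """
    vocabulary = sorted(LEXICON)
    for tries in range(1, max_tries + 1):
        candidate = [rng.choice(vocabulary) for _ in range(length)]
        if judge(candidate):
            return " ".join(candidate), tries
    return None, max_tries


if __name__ == "__main__":
    sentence, tries = generate_by_rejection(toy_judge, random.Random())
    print(f"accepted after {tries} tries: {sentence}")
```

                  Any judge can be plugged in as the first argument, so the generator is only ever as good as the judge. The number of tries needed grows quickly with sentence length and vocabulary size, which is the practical catch with this approach.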

                  You say Google is a multi-billion-dollar company, but they are also analyzing trillions of articles of content. They have to have the computation power to figure out whether every page on the web is well written. A smaller company might not have billions of dollars, but all they have to do is create a few thousand articles that are well written. Google can't spend hours of computation time per article (since there are trillions of them) determining the quality of their content.

                  Right now machines might be creating "soulless" content, but that will very likely not be the case in 5 or 10 years. Just like how 10 years ago computers couldn't write any kind of content.

                  So in short, Google can judge "quality" to some level. But it certainly can't judge whether content is "soulless" or not, and automated programs can very easily write or rewrite "soulless" content. By the time Google could figure out whether content is "soulless" or not, the same technology that would be used to make that distinction could also be used to write content that isn't soulless.
  • Profile picture of the author yukon
    Banned
    Lmao, spinners (fail).
  • Profile picture of the author TheAdsenseGuy
    Ok, this conversation is amusing.

    Do you all want to know a little secret?

    Go out and pick a long-tail keyword to rank for. Now copy and paste a paragraph of content from at least 4 different sources on the web. Put those 4 paragraphs on your page. That content will pass through every Panda update, no problem, and will rank just fine.

    For the past 2 years I've been a black hat autoblogger. No, I don't monetize the autoblog. I use a cloaking redirection plugin to keep Googlebot on the autoblog but automatically redirect visitors to a different, quality website. (By the way, putting affiliate links on a huge autoblog just gets it penalized right away. Mine are not monetized.)

    I can build thousands of pages with this type of content, and they survive and stay ranked through every Panda update. I don't build backlinks to them, so they survive Penguin updates too. They can send 100-300 visitors per day.

    Do I use spun content? Nope.

    Each page on my autoblog has anywhere from 8 to 20 snippets of content from 8 to 20 different sources on the web.

    Now, if you just copy and paste an ezinearticle onto your site, will it rank? Nope. But if your page content is from multiple sources on the web, it will rank just fine (no penalties).

    Don't believe it? Go ahead and try it. You'll be surprised.
    • Profile picture of the author nik0
      Banned
      Originally Posted by TheAdsenseGuy View Post

      Ok, this conversation is amusing.

      Do you all want to know a little secret?

      Go out and pick a long-tail keyword to rank for. Now copy and paste a paragraph of content from at least 4 different sources on the web. Put those 4 paragraphs on your page. That content will pass through every Panda update, no problem, and will rank just fine.

      For the past 2 years I've been a black hat autoblogger. No, I don't monetize the autoblog. I use a cloaking redirection plugin to keep Googlebot on the autoblog but automatically redirect visitors to a different, quality website. (By the way, putting affiliate links on a huge autoblog just gets it penalized right away. Mine are not monetized.)

      I can build thousands of pages with this type of content, and they survive and stay ranked through every Panda update. I don't build backlinks to them, so they survive Penguin updates too. They can send 100-300 visitors per day.

      Do I use spun content? Nope.

      Each page on my autoblog has anywhere from 8 to 20 snippets of content from 8 to 20 different sources on the web.

      Now, if you just copy and paste an ezinearticle onto your site, will it rank? Nope. But if your page content is from multiple sources on the web, it will rank just fine (no penalties).

      Don't believe it? Go ahead and try it. You'll be surprised.
      Although I don't practice this type of thing, I still have a certain love for it, but I'm too busy to figure such things out.

      Any recommended tools? And is that plugin for sale somewhere?
  • Profile picture of the author brettb
    When I was searching last year I found plenty of spun content ranking well. So as of last year Google certainly couldn't detect spun content. Either that or they chose not to penalise it too much.

    I guess that it's actually quite hard to detect, as poor English could just mean the writer is a non-native English speaker.
  • Profile picture of the author nik0
    Banned
    Sometimes I think they do it step by step to give people the chance to improve.

    Otherwise half the internet would tank in a day
  • Profile picture of the author jinx1221
    They can't detect spun text. Plain and simple. The real question is: can they detect, or do they score, grammar and readability? Yes, they can and do. They want the most readable and grammatically correct results in their search engine. That's a no-brainer, right? That said, since a lot of 'spun' text, if not most, is full of grammar errors and practically unreadable, you could say they can detect 'spun' content. That's circumstantial, though. You can write a perfectly readable, grammatically perfect spun article. That would pass the grammar and readability score with flying colors.

    It's like asking, "Are police good at catching fast cars?" What's a fast car? My car goes pretty fast. A Lamborghini can go faster than mine. 55 mph is pretty fast, but not illegal on the highway. It's not the speed capability of the car that's in question. The real question is, "Are police good at catching speeding cars?"
  • Profile picture of the author TopLinks
    Read various news stories on the same subject and you will understand what the difference between spinning and spamming is.

    Problem is, a spintax file that creates completely unique content that can be seen as stories and posted to various sites takes a long time to create and results in a 10MB+ text file. I've seen, literally, a 3-million-line text file. It took the writer a few months, and they're still making money from it. Not my style, but, geez.
    • Profile picture of the author C Adept
      Originally Posted by TopLinks View Post

      Read various news stories on the same subject and you will understand what the difference between spinning and spamming is.

      Problem is, a spintax file that creates completely unique content that can be seen as stories and posted to various sites takes a long time to create and results in a 10MB+ text file. I've seen, literally, a 3-million-line text file. It took the writer a few months, and they're still making money from it. Not my style, but, geez.
      Actually it does not take that long to create using an editor designed for that purpose. It is much easier than writing new, unique content. For each paragraph, create one or more variants with the same point as the original. Vary the number of sentences in each so spun documents will appear structurally different. Break the paragraphs into sentences. For each sentence, create several variations that express the same meaning. This is fairly mindless gruntwork if you have a decent facility with writing. You just put yourself in a chair and grind through the sentences. It does not take that long once you get the hang of it.

      In many cases documents discuss several points but the order of the points is not important. This may occur at the paragraph or sentence level. For more variation find those cases and mark them to be randomly reordered instead of choosing one at random.

      As a final step convert many words to spintax that maintains proper grammar.
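
      The workflow above (paragraph variants, sentence variants, randomly reordered points, then word-level spintax) can be sketched in a few lines. This is only a rough illustration, assuming the common {option|option} spintax notation with nesting allowed; the function names spin and reorder are made up for this example and do not come from any particular spinner.

```python
import random
import re

# Matches an innermost {a|b|c} group: braces with no braces nested inside.
_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(text, rng=None):
    """Expand nested spintax by resolving innermost groups until none remain."""
    rng = rng or random.Random()
    while True:
        # Replace every innermost group with one randomly chosen option.
        text, replaced = _GROUP.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if replaced == 0:
            return text


def reorder(points, rng=None):
    """Return the points in random order, for passages where order is unimportant."""
    rng = rng or random.Random()
    shuffled = list(points)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    # Sentence-level variants, each with word-level spintax inside.
    points = [
        "{Spinning|Rewriting} is {easy|{fairly|quite} simple}.",
        "{It takes|You need} {practice|patience}.",
    ]
    print(" ".join(spin(p) for p in reorder(points)))
```

      Resolving the innermost groups first is what lets whole-sentence variants contain word-level options. The first template above expands to one of six possible sentences.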
  • Profile picture of the author thebert
    Google has no problem detecting poorly spun content. Google will also hammer any rankings for your site that may have resulted from content they deem to have been spun.

    I'm on board with DanParks and MarketingFool. It's entirely possible to create spun content that is undetectable. It's more work, of course, but worth it.

    Anyone thinking of populating a money-site with spun content, no matter how "good", needs to give their head a shake!

    Good luck!
    • Profile picture of the author mantis108
      Originally Posted by thebert View Post

      Google has no problem detecting poorly spun content. Google will also hammer any rankings for your site that may have resulted from content they deem to have been spun.

      I'm on board with DanParks and MarketingFool. It's entirely possible to create spun content that is undetectable. It's more work, of course, but worth it.

      Anyone thinking of populating a money-site with spun content, no matter how "good", needs to give their head a shake!

      Good luck!
      I agree with your first point but challenge your second. There's absolutely no way Google would ever detect the high-quality spintax we create by hand. Just not going to happen, ever. There would be too many false positives and too much collateral damage if they even attempted to do that. You just need to not be lazy and hire smart people to do this.
      • Profile picture of the author Kevin Maguire
        Originally Posted by mantis108 View Post

        You just need to not be lazy and hire smart people to do this.
        I guess that discounts you from the possible candidates, being smart enough to bump a 6 month old thread.
  • Profile picture of the author netbread
    Everything is a spin of the main source anyway; look at the news.
  • Profile picture of the author seoboyz01
    I don't like spun content and I don't use it in any of my work. For me, it's just easier to come up with original content. The human brain is the best content creator of all, and no software is going to come up with a better diversity of content than a human can. Even if others are spinning content, I'll just stick to my own version of things.