Weird Revelation about Google Spider Bot!

15 replies
I don't know if this has already been discovered before or not... but I just came across it by accident, so I thought I should share it here.

Nothing groundbreaking, but certainly a bit weird.

Ok, here it is..

After the Panda updates hit, it became increasingly important to have unique content on our sites for better rankings. (Yeah, everyone knows that, lol.)

So I thought, let's see how far Google goes in determining what's "unique" and what's not.

We've all heard about the research according to which, if the first and last letters of a word stay the same and the inner letters get scrambled, our brains can still pick up the word because they process it as a whole.

Well it turns out that it's also true for Google spiders!

As an experiment, here's what I did:

I took an article and used an online word scrambler tool (sorry can't post any links yet because I don't have enough posts yet) to scramble the inner letters of all the words of this article.
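
For anyone who wants to reproduce the scrambling step without the online tool, here is a minimal Python sketch. It is not the tool's actual code; the function names `scramble_word` and `scramble_text` and the `seed` parameter are made up for illustration, and it assumes "words" are plain runs of ASCII letters.

```python
import random
import re


def scramble_word(word, rng):
    """Shuffle the inner letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        # Words of up to three letters have at most one inner letter: nothing to shuffle.
        return word
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]


def scramble_text(text, seed=None):
    """Scramble every run of ASCII letters; punctuation and spacing are left alone."""
    rng = random.Random(seed)  # pass a seed to get repeatable output
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


if __name__ == "__main__":
    print(scramble_text("Google spiders read individual words, not letters."))
```

The shuffle is random, so a word can occasionally come out unchanged, especially a short one.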

Then I compared the two articles using another online tool (my bad, can't post the link again!)

And guess what: It turned out that the two articles are 99% similar!

So apparently, even Google's spiders read individual words, not letters. Just like our brains.

Sneaky, isn't it?
#bot #google #revelation #seo #spider #unique content #weird
  • Profile picture of the author yukon
    Banned
    Interesting.

    I imagine they run the same text scan as they run when you search Google SERPs.

    Example:

    1) Searching for "acr arpts ofr sle" will return a suggested correction of "car parts for sale"

    2) acr arpts ofr sle
  • Profile picture of the author FreeMeal
    Does the tool you used to compare the two articles use the same technique as Googlebot to read pages?
  • Profile picture of the author UMS
    That's the 2nd new thing I've learnt today. Must be on a roll.

    Noticed that searching for

    bste ceahp isnrunace

    will return results for

    best cheap insurance

    In fact, even if you drop some of the letters from insurance, e.g. snrunce, it still corrects it to insurance.

    Google must have some interesting natural language processing abilities.
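
Nobody outside Google knows how its spelling correction works, but the transposed-letter cases above are easy to handle even with a toy approach. Here is a minimal Python sketch that looks each word up by its sorted letters. The vocabulary and the names `anagram_key`, `build_index` and `unscramble` are made up for illustration, and this is certainly not Google's method.

```python
def anagram_key(word):
    """Letters of the word in sorted order, so any reordering gives the same key."""
    return "".join(sorted(word.lower()))


def build_index(vocabulary):
    """Map each anagram key to the first vocabulary word that produces it."""
    index = {}
    for word in vocabulary:
        index.setdefault(anagram_key(word), word)
    return index


def unscramble(query, index):
    """Replace each query word with its vocabulary match; unknown words pass through."""
    return " ".join(index.get(anagram_key(w), w) for w in query.split())


# Tiny hypothetical vocabulary; a real system would need a full dictionary.
VOCABULARY = ["best", "cheap", "insurance", "car", "parts", "for", "sale"]
INDEX = build_index(VOCABULARY)

if __name__ == "__main__":
    print(unscramble("bste ceahp isnrunace", INDEX))
    print(unscramble("acr arpts ofr sle", INDEX))
```

This only catches reordered letters. A word with letters dropped, like snrunce or sle, passes through unchanged, so correcting those needs something more (edit distance, query logs, and so on). It also cannot tell real anagrams apart: "peach" would come back as "cheap".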
    • Profile picture of the author yukon
      Banned
      Originally Posted by UMS View Post

      That's the 2nd new thing I've learnt today. Must be on a roll.

      Noticed that searching for

      bste ceahp isnrunace

      will return results for

      best cheap insurance

      In fact, even if you drop some of the letters from insurance, e.g. snrunce, it still corrects it to insurance.

      Google must have some interesting natural language processing abilities.
      I'm sure Google Search gets billions of typos every single day.

      You can pretty much mangle a word & Google will still solve the typo most of the time.
  • Profile picture of the author Osman Safdar
    FreeMeal, Yes indeed, that tool emulates Google spider bots..
    • Profile picture of the author dp40oz
      Originally Posted by Osman Safdar View Post

      FreeMeal, Yes indeed, that tool emulates Google spider bots..
      No tool knows how Googlebot reads text except for Google, so this thread is highly speculative.
      • Profile picture of the author Osman Safdar
        Originally Posted by dp40oz View Post

        No tool knows how Googlebot reads text except for Google, so this thread is highly speculative.
        Yes, but one can come close enough, don't you think? Just Google "google search engine spider simulator" ... it's something that has been around for a long time.
        • Profile picture of the author dp40oz
          Originally Posted by Osman Safdar View Post

          Yes, but one can come close enough, don't you think? Just Google "google search engine spider simulator" ... it's something that has been around for a long time.
          One can assume how Google's spider reads or "sees" text, but Google's spiders are some of the most advanced bots ever made, backed by an endless bankroll to fine-tune them and "be smarter" than anything else similar. So to assume a random website's quick little script is at all accurately assessing what Google's spider bot is actually seeing, and how it's analyzed on Google's end, is highly speculative, especially in this specific case.
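
To illustrate why a third-party score says little on its own, here is a minimal Python sketch in which two equally plausible similarity measures give opposite answers on the same scrambled text. Neither is claimed to be what the checker in this thread or Google actually uses; the function names `word_overlap` and `letter_cosine` are made up for illustration.

```python
import math
from collections import Counter


def word_overlap(a, b):
    """Jaccard similarity over exact (lowercased) words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0


def letter_cosine(a, b):
    """Cosine similarity over letter counts, ignoring letter order entirely."""
    count_a = Counter(ch for ch in a.lower() if ch.isalpha())
    count_b = Counter(ch for ch in b.lower() if ch.isalpha())
    dot = sum(count_a[ch] * count_b[ch] for ch in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 1.0


if __name__ == "__main__":
    original = "best cheap insurance"
    scrambled = "bste ceahp isnrunace"
    print("exact-word overlap:", word_overlap(original, scrambled))
    print("letter-count cosine:", letter_cosine(original, scrambled))
```

The exact-word measure sees no shared words at all, while the letter-count measure sees essentially identical text, so a "99% similar" result depends heavily on which kind of measure a tool happens to implement.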
          • Profile picture of the author Osman Safdar
            Originally Posted by dp40oz View Post

            One can assume how Google's spider reads or "sees" text, but Google's spiders are some of the most advanced bots ever made, backed by an endless bankroll to fine-tune them and "be smarter" than anything else similar. So to assume a random website's quick little script is at all accurately assessing what Google's spider bot is actually seeing, and how it's analyzed on Google's end, is highly speculative, especially in this specific case.
            Can't argue with that. After all, a simulator is just... a simulator, not the real thing.

            But come to think of it: if a random little script is advanced enough to know that isnrunace is just a misspelled version of insurance... then Google's spiders must be REALLY good at picking up those misspellings, right?
  • Profile picture of the author Osman Safdar
    UMS, that's good! Learning never stops.
  • Profile picture of the author powerofschool
    I had noticed some of what you described, but you have also given some additional information that I hadn't noticed.

    Thanks
  • Profile picture of the author Osman Safdar
    But of course, to say that Google considers an article 99% similar (in terms of uniqueness) to the same article with every word misspelled... is just speculation.
  • Profile picture of the author Osman Safdar
    Now that I can post links, here are the two tools I mentioned in my OP:

    Text Scrambler: Word Scrambler

    Similar Page Checker: Similar Page Checker - Duplicate content checker
  • Profile picture of the author GeorgR.
    Do NOT trust "comparison" tools and sites, they are flaky at best.

    Example:

    Yesterday I made a spun article for a client which FOR SOME REASON showed up in copyscape as 0% or 1% unique. I could not explain this, since the article was spun and written the same way as the other articles, which (according to dupecop) should yield at least 40% uniqueness.

    I ran the article a few times and dupecop constantly told me "1%" unique!!

    So I changed only 2 (TWO) words around at the beginning of a paragraph and ran the article again, and all of a sudden dupecop told me it's 40% "unique".

    It looks as if dupecop/copyscape (and whatever other checkers) also look at the STRUCTURE of the article (paragraphs, sentences) and their order... and in fact often ignore the wording even when almost every word is spun!

    Short: The words (actual content) MIGHT play a rather small part in determining uniqueness....you have to "see" the entire article and also "play" with sentences, sentence lengths, paragraph lengths etc. so it becomes "unique".
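
To make the "structure" idea concrete, here is a minimal Python sketch of a purely structural fingerprint: the number of words in each sentence, grouped by paragraph. This is only a guess at the kind of signal a checker could look at, not how copyscape or dupecop actually work; the name `structure_fingerprint` is made up for illustration.

```python
import re


def structure_fingerprint(text):
    """Return, per paragraph, a tuple of word counts for each sentence."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    fingerprint = []
    for paragraph in paragraphs:
        # Naive sentence split on ., ! and ?; good enough for an illustration.
        sentences = [s for s in re.split(r"[.!?]+", paragraph) if s.strip()]
        fingerprint.append(tuple(len(s.split()) for s in sentences))
    return fingerprint


if __name__ == "__main__":
    original = "Dogs bark loudly. Cats sleep.\n\nBirds fly south in winter."
    word_spun = "Hounds howl noisily. Kittens nap.\n\nGeese travel south in autumn."
    restructured = "Dogs bark loudly and cats sleep.\n\nBirds fly south in winter."

    print(structure_fingerprint(original) == structure_fingerprint(word_spun))
    print(structure_fingerprint(original) == structure_fingerprint(restructured))
```

A word-for-word spin leaves this fingerprint untouched, while merging two sentences changes it, which matches the observation that single-word spins can leave an article looking structurally identical.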

    This explains why your scrambling of individual words was not very effective.

    **

    In my own experience, what is very effective is spinning whole phrases (and sentences of course)... resulting in sentences/phrases of different lengths.

    What is BAD is keeping the overall structure of an article and only spinning individual words. This is why I don't like to work with those "tips articles" which have lists in the article like 1. 2. 3. etc., or articles with short sub-headlines within the article. You can spin as much as you want, and the overall structure of the article will STILL be about the same as the original one if you keep the "tips" lists and the sub-headlines in the same position.

    *
    As for Google: We do not KNOW exactly how Google determines dupes...it could be on sentence level, it could be on word level and Google could KNOW about synonyms which would make any single level word spins useless, more or less. Google COULD also ignore the actual sentences and just look at the overall "appearance" of an article....or combine all those things together etc. Just don't assume things, my $0.02
  • Profile picture of the author Osman Safdar
    Thanks for your input, Georg... that's very helpful.
