Googlebot Experiment FAIL - Please Explain!

10 replies
A few days ago, I started a thread about Article Submission and Social Bookmarking, where I referred to an article I had just posted on my website called Discerning Health Disinformation on the Web.

As my account is too new to include a link in the text of my post, I simply referred to the article without a link. However, there is a link to the homepage of my website in my signature. (See below)

Part of that discussion was about making sure that my article had been indexed by Google before pointing to it using Social Bookmarking.

Just now, I did a Google search for "Discerning Health Disinformation on the Web" and my article was nowhere to be found. However, as it turns out, my post on this forum referencing the article has been indexed... twice!

So what gives? How come the googlebot did not follow the link in my signature, arrive at the article, and index it??

Some additional tidbits:

This forum does not use the "nofollow" attribute, so the bot should have followed the link... right?

Yes, there is a direct path to the article from the homepage of my site.

Yes, I did update the sitemap.

Can anybody explain this to me? :confused:
  • bgmacaw
    Originally Posted by drmattnd

    So what gives? How come the googlebot did not follow the link in my signature, arrive at the article, and index it??
    ----------------
    Can anybody explain this to me? :confused:
    First, you need to directly link to the page you want indexed: Discerning Health Disinformation on the Web. Don't depend upon linking externally to the index page to get your backpage content indexed. While it will happen eventually, it can be rather slow.

    Next, you should use a keyword friendly url, not an abbreviation. Don't use disinfo.html but discerning-health-disinformation-on-the-web.html. This helps Google properly index the content and, usually, do it quicker.
    • rosetrees
      If you do the search in quotes, you are No. 2 of only 2 results.

      If you omit the quotes, Google throws up 58 pages of results. Your article is too new to be ranked. Even the best SEO, on a regularly ranked site, can take weeks or months to have full effect.

      Google loves the Warrior Forum and indexes and ranks posts on here in record time. But the WF is huge and has good PR and age on its side.

      Be patient and continue backlinking, directory adding, etc.
    • drmattnd
      Originally Posted by bgmacaw

      Next, you should use a keyword friendly url, not an abbreviation. Don't use disinfo.html but discerning-health-disinformation-on-the-web.html. This helps Google properly index the content and, usually, do it quicker.
      Thanks for the info: I tried to send you the following PM but apparently I can't send PMs yet. I understand why.. but it is frustrating.

      Thanks for your comment yesterday on my post.

      I have taken your recommendation and changed the url to:

      ...articles/discerning-health-disinformation-on-the-web.html

      Unfortunately, I have not -yet- been able to create a redirect in .htaccess for the old url.

      If you wouldn't mind taking a moment to update the URL in the link that you created in your post so that nobody gets a 404, I would much appreciate it!!

      Also, if you happen to know of a good resource for .htaccess in Expression Web 2.0 I would appreciate it. For some reason my software won't properly identify the file.

      Many thanks!!
  • yukon
    Banned
    [DELETED]
    • RemingtonSteele
      You're getting a 404 because the page has been renamed to:

      discerning-health-disinformation-on-the-web.html

      Drmattnd, you should do a 301-redirect of the old page to the new page. There are multiple ways to do this. The easiest way is probably to create the disinfo.html file again and put the redirect code in there. That way, any requests to disinfo.html will automatically be redirected to the new page. (Make sure you put disinfo.html in the same location as before -- in your "articles" directory.)

      Here's the code to put in disinfo.html:

      Code:
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
      <meta http-equiv="refresh" content="0;url=http://www.drmattnd.com/articles/discerning-health-disinformation-on-the-web.html" />
      <title>This page has moved</title>
      </head>
      <body>
      </body>
      </html>
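
A note on terminology, offered as a sketch rather than anything from the posters: the meta refresh above is a client-side redirect that the browser performs after receiving a normal 200 page, whereas a true 301 is an HTTP status code sent by the server together with a Location header. The Python helper below is hypothetical (the name classify_redirect and its return labels are made up for illustration) and only shows how the two cases could be told apart from a response's status, headers, and body.

```python
import re

# Matches a <meta http-equiv="refresh" ...> tag anywhere in an HTML body.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*>', re.IGNORECASE
)


def classify_redirect(status, headers, body):
    """Hypothetical helper: label a response as a server-side 301,
    a client-side meta refresh, or neither."""
    header_names = {name.lower() for name in headers}
    # A real 301 is signalled by the status line plus a Location header.
    if status == 301 and "location" in header_names:
        return "http-301"
    # A meta refresh arrives as an ordinary 200 page containing the tag.
    if status == 200 and META_REFRESH.search(body):
        return "meta-refresh"
    return "none"


# Example: the page from the post above, served normally by the web server.
page = ('<meta http-equiv="refresh" content="0;url=http://www.drmattnd.com/'
        'articles/discerning-health-disinformation-on-the-web.html" />')
print(classify_redirect(200, {"Content-Type": "text/html"}, page))
```

On an Apache host with mod_alias available (an assumption about this site), a server-side 301 would typically be a single .htaccess line such as `Redirect 301 /articles/disinfo.html http://www.drmattnd.com/articles/discerning-health-disinformation-on-the-web.html`, which also leaves no redirect markup in the page to validate.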
      • drmattnd
        Originally Posted by RemingtonSteele

        Drmattnd, you should do a 301-redirect of the old page to the new page. There are multiple ways to do this. The easiest way is probably to create the disinfo.html file again and put the redirect code in there. That way, any requests to disinfo.html will automatically be redirected to the new page. (Make sure you put disinfo.html in the same location as before -- in your "articles" directory.)
        Thanks, I went ahead and did the 301 redirect. Although ideally I would like to use .htaccess as I want my site to validate XHTML 1.0 strict. Haven't figured it out yet though!:confused:
        • tpw
          Another factor may be the way that GoogleBot actually crawls...

          They hit the WF thread, index it, then catalog all of the links in the page...

          Then a few days later, they go back and crawl all of the new links they have found...

          So in this case, a few days after they found the WF thread, they crawled your home page...

          On crawling your home page, they indexed it, then cataloged all of the links in the page...

          A few days later, they will crawl all of the links found on your home page...

          If the article in question is linked from the front page of your site, they will crawl it roughly 6-8 days from the time they found the WF thread... (just guessing on how many days between crawls)

          If the article in question was not linked from the front page of your site, GoogleBot may have to crawl through several levels of your site to find it, and therefore it may take more than a few days to reach your article...
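
The wave-by-wave crawl described above can be sketched as a breadth-first search over links, where each hop away from the starting page costs one more crawl interval. This is only a toy model under stated assumptions: the link graph, the page names, and the 3-day interval are all hypothetical (the interval is a guess in the post itself), and it is not a description of how Googlebot actually schedules crawls.

```python
from collections import deque


def discovery_day(links, seed, target, days_between_crawls):
    """Estimate the day on which `target` is first crawled, assuming each
    link hop from `seed` costs one crawl interval. Returns None if the
    target cannot be reached by following links."""
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if page == target:
            return depth[page] * days_between_crawls
        for linked_page in links.get(page, []):
            if linked_page not in depth:  # first time this page is seen
                depth[linked_page] = depth[page] + 1
                queue.append(linked_page)
    return None


# Hypothetical link graph: forum thread -> homepage -> article.
links = {
    "wf-thread": ["homepage"],
    "homepage": ["articles-index", "article"],
    "articles-index": ["article"],
}

# The article is two hops from the thread, so with a 3-day interval: 6.
print(discovery_day(links, "wf-thread", "article", 3))
```

If the homepage link to the article is removed from this toy graph, the article moves one hop deeper and the estimate grows by one interval, which is the point about direct links made earlier in the thread.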
        • RemingtonSteele
          Originally Posted by drmattnd

          Thanks, I went ahead and did the 301 redirect. Although ideally I would like to use .htaccess as I want my site to validate XHTML 1.0 strict. Haven't figured it out yet though!:confused:
          This tutorial might help:

          Code:
          http://www.webweaver.nu/html-tips/web-redirection.shtml
  • christopher jon
    'eh, I see you as the #1 result (with and without quotes).

    Lesson learned here: Have some patience before crying that the sky is falling.