Google Webmaster Tools + Crawl Errors & SEO

8 replies • SEO
Hello everyone,

Earlier today, I decided to take a few extra minutes to explore my Google Webmaster Tools account. In the past, all I ever did was log in and submit my new site and sitemap.

But apparently I should be paying a little more attention to the sites I add, because one of them has 139 crawl errors:

Not followed (redirect error) - 1 error in total. It then says that the URL was detected on Aug 2, 2010.

Not Found (404) - 138 errors in total. Under the "Linked from" column, it says "unavailable" for most of the links, and the rest say "h**p://mydomain.net/sitemap.xml.gz". All the URLs were detected at a later date.

Now with that all out of the way, I have a few questions for the more experienced Warriors. For starters, why did Google not find these links in the first place? Are these crawl errors normal? Do these errors have any negative consequences for my rankings? And one other thing: if Google says a link was "detected," does that mean the error is now fixed and the URL is indexed by Google?
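
(If you want to see what those URLs actually return today, a quick check is easy to script. Below is a rough Python sketch; the URL list is a placeholder for the entries exported from the Crawl errors report.)

    import urllib.error
    import urllib.request

    # Placeholder list: paste in the URLs from the Crawl errors report.
    urls = [
        "http://mydomain.net/some-post/",
        "http://mydomain.net/sitemap.xml.gz",
    ]

    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                print(resp.status, url)
        except urllib.error.HTTPError as e:
            print(e.code, url)  # e.g. 404 for a missing page
        except urllib.error.URLError as e:
            print("unreachable:", url, e.reason)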

Thank you,
Gordon
#crawl #errors #google #seo #tool #webmaster
  • GeorgeKuipers
    These errors might have a negative influence on SEO only if their number exceeds some threshold (let's say 5% of the total indexed pages). Google often makes mistakes when identifying URLs, so it cannot make solid negative decisions based on this metric alone.

    PS: With my 100,000 pages indexed, I always have 100-200 not-found URLs. Most of them are irrelevant, since they look like site.com//page (a double slash or something like that).
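
    (A quick way to separate those harmless malformed URLs from real missing pages is to look at the path. A small Python sketch, with placeholder URLs:)

        from urllib.parse import urlparse

        # Placeholder sample of "not found" URLs from the report.
        error_urls = [
            "http://site.com/page",
            "http://site.com//page",  # doubled slash: likely a bad link, not a real page
        ]

        for url in error_urls:
            if "//" in urlparse(url).path:
                print("malformed, probably safe to ignore:", url)
            else:
                print("worth investigating:", url)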

    Regards
    George
    • TheChanger
      Originally Posted by GeorgeKuipers

      These errors might have a negative influence on SEO only if their number exceeds some threshold (let's say 5% of the total indexed pages). Google often makes mistakes when identifying URLs, so it cannot make solid negative decisions based on this metric alone.

      PS: With my 100,000 pages indexed, I always have 100-200 not-found URLs. Most of them are irrelevant, since they look like site.com//page (a double slash or something like that).

      Regards
      George
      Thanks for the response!

      So if Google says that the URL was "detected," is it safe to assume that the error is now fixed and the URLs are indexed?

      Just curious, because it seems like a lot of errors for such a small blog (200 URLs in the web index).

      Thanks,
      Gordon
      • GeorgeKuipers
        Originally Posted by Gordon Hay

        Thanks for the response!

        So if Google says that the URL was "detected," is it safe to assume that the error is now fixed and the URLs are indexed?

        Just curious, because it seems like a lot of errors for such a small blog (200 URLs in the web index).

        Thanks,
        Gordon
        Hi Gordon

        Please don't confuse the terms here: it's not that a link was detected, but that a page URL was detected.
        "The URL was detected" means that somewhere on the web (or on your site), Google came across a link to that URL and tried to open the page, but could not. There is no assumption here that the problem is solved.

        Then Google says that those URLs were not found (a 404 error code). A 404 does not necessarily mean that the page does not exist; it only reflects that the page could not be opened when Google requested it.
        And looking at your example URL, I might have an explanation why: Google may be having trouble opening your gz file.
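
        (One way to test that theory is to fetch the gz sitemap and see whether it decompresses into valid XML. A hypothetical Python sketch; the domain is a placeholder:)

            import gzip
            import urllib.request
            import xml.etree.ElementTree as ET

            url = "http://mydomain.net/sitemap.xml.gz"  # placeholder domain

            with urllib.request.urlopen(url, timeout=10) as resp:
                raw = resp.read()

            xml_bytes = gzip.decompress(raw)  # raises an error if the file is not valid gzip
            root = ET.fromstring(xml_bytes)   # raises an error if the content is not XML
            print("sitemap parses, entries:", len(root))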

        Regards
        George
        • TheChanger
          Originally Posted by GeorgeKuipers

          Hi Gordon

          Please don't confuse the terms here: it's not that a link was detected, but that a page URL was detected.
          "The URL was detected" means that somewhere on the web (or on your site), Google came across a link to that URL and tried to open the page, but could not. There is no assumption here that the problem is solved.

          Then Google says that those URLs were not found (a 404 error code). A 404 does not necessarily mean that the page does not exist; it only reflects that the page could not be opened when Google requested it.
          And looking at your example URL, I might have an explanation why: Google may be having trouble opening your gz file.

          Regards
          George
          Hi George,

          So I guess the real problem is with the "sitemap.xml.gz" file that the Google XML Sitemaps plugin submits to Google. I will delete "sitemap.xml.gz" from Google Webmaster Tools and just use "sitemap.xml" instead, and hopefully that fixes all those crawl errors.
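
          (Before resubmitting, it may be worth confirming that the plain sitemap is actually served correctly. A minimal Python sketch; the domain is again a placeholder:)

              import urllib.request
              import xml.etree.ElementTree as ET

              with urllib.request.urlopen("http://mydomain.net/sitemap.xml", timeout=10) as resp:
                  print("HTTP status:", resp.status)
                  print("Content-Type:", resp.headers.get("Content-Type"))
                  root = ET.fromstring(resp.read())

              print("URL entries in sitemap:", len(root))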

          Thanks for all the help,
          Gordon
  • ursimrankhanna
    Hi,
    Nice, helpful information. Thanks for sharing.
  • GeorgeKuipers
    Hi Gordon, you can try this; it should work.
    It's still not clear what the other 137 URLs that return the 404 code are. Do they all end in gz?
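
    (To answer that question quickly, the error URLs can be grouped by how they end. A small Python sketch with placeholder data:)

        import os
        from collections import Counter
        from urllib.parse import urlparse

        # Placeholder sample of the 138 "not found" URLs from the report.
        error_urls = [
            "http://mydomain.net/sitemap.xml.gz",
            "http://mydomain.net/old-post/",
        ]

        counts = Counter(
            os.path.splitext(urlparse(u).path)[1] or "(no extension)"
            for u in error_urls
        )
        for ending, n in counts.most_common():
            print(ending, n)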

    George
    • TheChanger
      Originally Posted by GeorgeKuipers

      Hi Gordon, you can try this; it should work.
      It's still not clear what the other 137 URLs that return the 404 code are. Do they all end in gz?

      George
      Hi George,

      I think I may have discovered the problem. When I was looking over the install instructions for the Google XML Sitemaps plugin, I realized that I had missed a step.

      Step 2 of the install says the following:

      Use your favorite FTP program to create two files in your WordPress directory (that's where the wp-config.php is) named sitemap.xml and sitemap.xml.gz and make them writable via CHMOD 666
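
      (For reference, the same step could be done in a few lines of Python instead of an FTP client. This is a hypothetical sketch, and the WordPress path is a placeholder:)

          import os

          wp_root = "/var/www/html"  # placeholder: the directory with wp-config.php

          for name in ("sitemap.xml", "sitemap.xml.gz"):
              path = os.path.join(wp_root, name)
              open(path, "a").close()  # create the file if it does not exist
              os.chmod(path, 0o666)    # CHMOD 666: readable and writable by everyone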

      So this is where I messed up: I created the two files but failed to change the file permissions.

      So this is what I did:
      • Changed the file permissions to CHMOD 666
      • Rebuilt the sitemaps
      • Re-submitted both sitemap.xml & sitemap.xml.gz to Google Webmaster Tools.
      So hopefully the above actions will take care of all future crawl errors.

      Thanks,
      Gordon
  • GeorgeKuipers
    Yep, I think this should work.

    Good luck!
    George
