Webmaster Tools robots.txt file

6 replies
When I check my site in Google Webmaster Tools, the Crawler access section shows a robots.txt file that was apparently downloaded yesterday.

In the box it has this:
User-agent: *
Disallow:

Sitemap: http......tore.com/sitemap.xml.gz

I did not create this robots.txt file. Does anyone know what it is? I thought you only needed one to stop Google from crawling certain pages.

PS - I don't have 15 posts yet and can't post links properly, so I couldn't leave the whole URL in. It is my main site URL.
  • Profile picture of the author SmartWeb
    If you did not create this robots.txt, a plugin you are using may have generated it.
    robots.txt tells Google both what it may crawl and what it may not.
    When Google finds a robots.txt with a Sitemap line, it can fetch that sitemap to get the list of pages you have, so that it can visit those pages and index them.
    You can also ask Google not to visit certain folders or pages by listing them as disallowed in robots.txt.

    It's always good to have a robots.txt.
  • Profile picture of the author SteveJohnson
    If you're running WordPress, you won't see a robots.txt file in your file structure because it doesn't exist as a physical file. WP generates and serves it in response to a robot's request.
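
To illustrate what "generated on request" means, here is a minimal Python sketch of a server that answers /robots.txt with text built in memory, so no robots.txt ever exists on disk. This is not WordPress's actual code (WordPress is written in PHP); the names `build_robots_txt`, `app`, and `fetch`, and the example.com sitemap address, are made up for the example.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical sitemap address; the real one is elided in the thread.
SITEMAP_URL = "https://example.com/sitemap.xml.gz"


def build_robots_txt(sitemap_url):
    """Build the robots.txt body in memory; nothing is written to disk."""
    return "User-agent: *\nDisallow:\n\nSitemap: " + sitemap_url + "\n"


def app(environ, start_response):
    """Tiny WSGI app: answer /robots.txt with generated text, else 404."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if environ.get("PATH_INFO") == "/robots.txt":
        start_response("200 OK", headers)
        return [build_robots_txt(SITEMAP_URL).encode("utf-8")]
    start_response("404 Not Found", headers)
    return [b"Not found\n"]


def fetch(path):
    """Call the app directly with a fake request, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    seen = {}

    def start_response(status, response_headers):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], body.decode("utf-8")


if __name__ == "__main__":
    status, text = fetch("/robots.txt")
    print(status)
    print(text)
```

A crawler requesting /robots.txt gets a normal response either way, which is why Webmaster Tools can show a file you never uploaded.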
    • Profile picture of the author Apollo77
      So this was created by WordPress then. I use the Google sitemap plugin; then in Google Webmaster Tools I add my site and then my sitemap.

      Should I be concerned about this particular note then:
      User-agent: *
      Disallow:
      Sitemap: http......tore.com/sitemap.xml.gz


      It's not telling it to disallow my normal sitemap, just /sitemap.xml.gz, whatever that is. Is this OK?

      "If Google finds robots.txt, it will try to get the list of all pages/links you have, so that it can visit those pages and index them."
      Should I add a robots.txt file and add all the main URLs from my site that I want indexed? Will this ensure they are indexed?
      • Profile picture of the author SmartWeb
        You don't need to do anything. robots.txt says which files robots may crawl, which they may not, and where the sitemap file is located. (The sitemap file can be sitemap.xml or sitemap.xml.gz.)
        So your robots.txt looks correct.
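
On the sitemap.xml versus sitemap.xml.gz point: the .gz file is just a gzip-compressed sitemap. A small Python sketch, assuming a made-up two-URL sitemap on example.com and a hypothetical helper `list_urls`, shows how the same URL list can be read from either form:

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap content for illustration only.
SITEMAP_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/category/widgets/</loc></url>
</urlset>
"""


def list_urls(data):
    """Return the <loc> URLs from sitemap bytes, gzipped or plain."""
    # Gzip streams start with the two magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    root = ET.fromstring(data)
    return [loc.text for loc in root.findall("sm:url/sm:loc", NS)]


if __name__ == "__main__":
    plain = list_urls(SITEMAP_XML)
    zipped = list_urls(gzip.compress(SITEMAP_XML))
    print(plain == zipped)
    for url in plain:
        print(url)
```

The compressed copy mainly saves bandwidth; whether a given plugin writes one file or both is up to its settings.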
        • Profile picture of the author Apollo77
          Thanks guys. I have two sitemap files for some reason: the one I submitted (/sitemap.xml) and one Google added (/sitemap.xml.gz).

          I was also a little worried because my new site is not showing up for my main keywords. It says that 18 of 44 URLs are indexed, though. Are all the URLs supposed to be indexed? I included my category URLs in my sitemap. Is that OK, or is it too many URLs? Should I just submit posts and pages?
  • Profile picture of the author SteveJohnson
    Search Google for information about robots files. The format is one directive per line. There are no disallows in this particular file: an empty Disallow: line blocks nothing. The Sitemap: line tells the robot the location of the gzipped sitemap.
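
If you want to check that reading yourself, a short Python sketch with the standard library's `urllib.robotparser` can parse the same three directives. The example.com addresses stand in for the elided site URL, `load_rules` is a name chosen for this example, and `site_maps()` needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Same directives as in the thread; the domain is a stand-in.
OPEN_ROBOTS = """\
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml.gz
"""

# A contrasting file that really does block a folder.
BLOCKING_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def load_rules(text):
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    open_rules = load_rules(OPEN_ROBOTS)
    # Empty Disallow: every path is allowed for every robot.
    print(open_rules.can_fetch("Googlebot", "https://example.com/any/page/"))
    # The Sitemap: line is reported separately from the crawl rules.
    print(open_rules.site_maps())

    blocking_rules = load_rules(BLOCKING_ROBOTS)
    print(blocking_rules.can_fetch("Googlebot", "https://example.com/private/x"))
    print(blocking_rules.can_fetch("Googlebot", "https://example.com/blog/"))
```

The Sitemap: line only points at the sitemap; it has no effect on what is allowed or disallowed.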

    No, you shouldn't add a robots file; you already have one. If you have non-WordPress addresses that should be in the sitemap, you can add them on the sitemap plugin's settings page.

    You can read about how WordPress natively handles SEO here: Search Engine Optimization for WordPress (WordPress Codex)