Keeping search bots out: please double-check my robots.txt file

5 replies
Hello there,

I want to keep Google and any other search bots out of one of my directories. My understanding is that I can do this by creating a robots.txt file, so I uploaded one to public_html. Is the site root the correct place for it? And here is the code. Does it look okay?

Code:
User-Agent: *
Disallow: /folder_name
I get paranoid that I'll keep Google out of the whole site. I just want them out of this one directory only.

Thanks!
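
One way to double-check rules like the ones above without touching the live site is Python's standard `urllib.robotparser`. This is only a minimal sketch: the test paths and the `is_allowed` helper are made up for illustration, and `Googlebot` is used as an example user-agent string. It also shows that `Disallow` is a prefix match, so `/folder_name` covers any path that merely starts with that text; writing `/folder_name/` with a trailing slash limits the rule to the directory itself.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, unchanged.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /folder_name
"""


def is_allowed(robots_txt: str, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a bot that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rule.
    for path in ("/", "/about.html", "/folder_name/page.html", "/folder_name_old/"):
        print(path, "allowed" if is_allowed(ROBOTS_TXT, path) else "blocked")
```

The rest of the site stays crawlable; only paths beginning with `/folder_name` are blocked for bots that respect the file.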
  • Profile picture of the author MemberWing
    To be sure, sign up for Google Webmaster Tools; it has a robots.txt checking facility.
    • Profile picture of the author Alan Petersen
      Thanks for the tip. I'm trying to verify the site but they're unable to verify due to this error:

      We've detected that your 404 (file not found) error page returns a status of 200 (Success) in the header.
      I can't find an answer from Google on how to fix it. Any suggestions?
  • Profile picture of the author K.Saravana Kumar
    Banned
    [DELETED]
    • Profile picture of the author Alan Petersen
      Originally Posted by K.Saravana Kumar

      It looks like you are using mod_rewrite or a modified .htaccess. You need to fix it so that pages that are not found return a 404 error code.
      Cool thanks! I had this in my .htaccess file:

      Code:
      ErrorDocument 404 /likes/index.php
      So it was going back to my index page instead of displaying a 404 error page. I think I read that this was a good thing to do, so I put it in there so folks wouldn't get a 404 page. But I took it out and now it works! And Google has verified my site. I'll just edit the 404 page with an ad or something. :-)

      Thanks!
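
The "404 page returns 200" problem described above can be checked by requesting a page that should not exist and looking at the status code that comes back. A rough sketch using only the Python standard library; the function names and the random probe path are assumptions made up for this example, not anything Google's verifier is known to use.

```python
import urllib.error
import urllib.request
import uuid


def missing_page_status(base_url: str) -> int:
    """Request a path that almost certainly does not exist; return the HTTP status."""
    # Hypothetical probe path: a random name that no real page should have.
    probe = base_url.rstrip("/") + "/" + uuid.uuid4().hex + ".html"
    try:
        with urllib.request.urlopen(probe, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the exception.
        return err.code


def looks_like_soft_404(status: int) -> bool:
    """A missing page answered with a 2xx status is the error quoted above."""
    return 200 <= status < 300


# Usage (needs network access; replace with your own site):
#   status = missing_page_status("http://www.example.com")
#   print(status, "soft 404" if looks_like_soft_404(status) else "ok")
```

A healthy setup should report 404 for the probe, even if the page shown to visitors is a custom one.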
  • Profile picture of the author tgrpublishing
    Just a note: robots.txt doesn't GUARANTEE that bots won't access the directory. Only ethically written bots will avoid it. That does mean at least the big 3 will stay out, though.
    • Profile picture of the author ehicks727
      Originally Posted by TigerPublishing

      Just a note: robots.txt doesn't GUARANTEE that bots won't access the directory. Only ethically written bots will avoid it. That does mean at least the big 3 will stay out, though.
      Yeah, that's the plain truth. I wrote a custom analytics program, and I have to maintain a list of over 1000 bots now so I can screen out non-human traffic.

      The worst are the scrapers that just insert random text into the user-agent string.

      Don't think that putting something in your robots.txt will keep all the bots out. Most bots roaming the Internet aren't respectful.
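
For anyone curious what screening by a bot list can look like, here is a minimal sketch that matches substrings in the user-agent header. It is not the analytics program mentioned above; the three-entry token list and the `is_known_bot` name are placeholders for illustration, and a real list would be far longer.

```python
# Hypothetical, deliberately tiny token list for illustration only.
KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "slurp")


def is_known_bot(user_agent: str, tokens=KNOWN_BOT_TOKENS) -> bool:
    """Return True if the user-agent contains any known bot token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "xK29fjq random text",  # a scraper sending junk slips straight through
    ]
    for ua in samples:
        print(is_known_bot(ua), ua)
```

The last sample shows the limitation: a scraper that sends random text matches nothing on the list, so this kind of filter only catches bots that identify themselves.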