How do you hide robots.txt?

8 replies
I thought it would be a good idea to try to prevent SEs from indexing my one-time offer (OTO) URLs, so I added them to my robots.txt. But it's really easy for anyone to go to mysite.com/robots.txt and view that file.

It's actually not a really big deal, but if I wanted to hide my robots.txt file, how would I do that?

Or, maybe the question should be what is the best way to hide html files from SEs and people?
#hide #robotstxt
  • Ross Dalangin
    Password-protect it, then give the link and the password only to the buyers you select.
  • bendiggs
    As Ross said, don't allow robots.txt to be accessed. The easiest way to do that would probably be to use the authentication capabilities of your server. If you are using Apache, see "Apache Tutorial: .htaccess files" for details on how to configure authentication with .htaccess. Good luck.
    • Robert Plank
      Why not just add the whole folder (e.g. /offers) to robots.txt, put your offers inside it (like /offers/thing1.html), and disable directory indexing?
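As a rough illustration of the folder approach, the sketch below uses Python's standard `urllib.robotparser` to show that a single folder rule covers every page beneath it without naming any of them. The robots.txt content, the `is_allowed` helper, and the `ExampleBot` user agent are hypothetical stand-ins, not anything from the original site. Disabling directory indexing is a separate server setting (on Apache, for instance, `Options -Indexes`).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only the folder is named, never the individual offer pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /offers/
"""


def is_allowed(path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/offers/thing1.html", "/index.html"):
        print(path, is_allowed(path))
```

A visitor who reads this robots.txt learns only that /offers/ exists, and with directory listing off they would still have to guess the file names.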
      • Ryan_Taylor
        Originally posted by Robert Plank:

        Why not just add the whole folder, i.e. /offers to robots.txt, and add your offers like /offers/thing1.html and disable directory indexing?
        Duh . . . Thanks Robert. I didn't have my coffee yet at the time.

        In the end, if someone wants to take the time to find my OTO page and purchase, that's fine by me.
        • edynas
          Originally posted by Ryan_Taylor:

          Duh . . . Thanks Robert. I didn't have my coffee yet at the time.

          In the end, if someone wants to take the time to find my OTO page and purchase, that's fine by me
          If that's the case, then why go through the trouble of cloaking robots.txt?
          If in the end you don't mind people buying your OTO, I would just add that file to robots.txt to exclude it from indexing.

          You also need to think about this: what kind of person would go to your robots.txt file and look through it? Is it likely a customer, or rather someone looking for a download page?

          But if you really do need to cloak your robots.txt, I guess this is the fastest, simplest way. Not tested, so there might be some mistakes in it.

          1. Create a .htaccess file in the site root with these lines in it:

          RewriteEngine On
          RewriteRule ^robots\.txt$ robot.php [L]

          2. Create a file robot.php that authenticates the visitor (is it a bot or a human?) and returns the robots.txt rules only to bots.
          A list of bots is at "Search engine robots".
          Sample PHP code for double authentication is at "PHP Search Engine Bot Authentication - eKstreme.com".
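The linked PHP sample is not reproduced here. As a hedged sketch of what "double authenticating" a crawler usually means, the Python below does a reverse DNS lookup on the visitor's IP, checks the hostname against a list of crawler domains, then does a forward lookup to confirm the hostname maps back to the same IP. The `CRAWLER_SUFFIXES` tuple lists only Googlebot's domains as an assumption, and the function names and injectable lookup parameters are hypothetical, added so the logic can be tried without network access.

```python
import socket
from typing import Callable, List, Sequence

# Assumption: hostname suffixes used by Googlebot; other search engines use other domains.
CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Reverse lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(hostname: str) -> List[str]:
    """Forward lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
    suffixes: Sequence[str] = CRAWLER_SUFFIXES,
) -> bool:
    """Return True only if both the reverse and the forward lookup agree."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # lookup failed, so treat the visitor as unverified
        return False
    if not hostname.endswith(tuple(suffixes)):
        return False
    try:
        # The forward lookup guards against spoofed reverse DNS records.
        return ip in forward(hostname)
    except OSError:
        return False
```

A script such as robot.php would run an equivalent check on the requesting IP and return the real rules only when it passes. Checking the User-Agent header alone is not enough, since any visitor can set it.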
          • ravijayagopal
            Here's an interesting idea:

            If you don't worry about hiding anything, then let's say your OTOs get indexed.

            And someone finds them, or figures out the link (on a SE, or by looking at your robots.txt, whatever).

            They now know that they're not supposed to be getting this deal. They think they're really "cool" or "smart" for "hacking" their way to your "special" offer.

            They still need to "buy" something, I guess.

            So, you get your unexpected sale.

            And your customer goes back smiling, thinking that they got the better end of the deal, because they just happened to trick you into giving them a great deal.

            Win-win.

            No?

            My $0.02.

            - Ravi Jayagopal
    • mrblack
      Originally posted by bendiggs:

      As Ross said don't allow robots.txt to be accessed. The easiest way to do that would probably be to use the authentication capabilities of your server. If you are using Apache look here Apache Tutorial: .htaccess files for details on how to configure authentication and use .htaccess. Good luck.
      If you are suggesting making the robots.txt file password protected via .htaccess, then of course the web crawlers would not be able to access it either (unless you did something fairly complicated and custom, assuming that is even possible).

      The suggestion below to use a robots meta tag is one good method. The other (although more tedious) would be to create a separate robots.txt file within the directory of the content you do not want accessed. If you place your more secretive or selective offers within their own directories, you can place individual robots.txt files within those directories, so you would not have to list them in your root directory robots.txt file. Note, though, that crawlers only request robots.txt from the root of a host, so this only works if each such directory is served as its own host (for example a subdomain); otherwise a robots.txt in a subdirectory is simply ignored.

      If you do not have your more private or time-sensitive offers within their own directory, then the meta tag is the most logical solution in my opinion. Besides, .htaccess rewrite rules are often not available if you move to a Windows server (most Windows servers run IIS instead of Apache), and URL rewrites are written a little differently there because IIS uses its own rewrite engines (ISAPI Rewrite, OPURL, IIS Rewrite, etc.).

      So all in all, for ease of management in case you do move to a different server, the non-.htaccess methods make more sense.

      Useful site - The Web Robots Pages
  • Felu
    Originally posted by Ryan_Taylor:

    I thought it would be a good idea to try to prevent SEs from indexing my one time offer urls, so I added them to my robots.txt. But it's real easy for anyone to go to mysite.com/robots.txt to view that file.

    It's actually not a really big deal, but if I wanted to hide my robots.txt file, how do you do that?

    Or, maybe the question should be what is the best way to hide html files from SEs and people?
    Ryan

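    Just remove your robots.txt file and add this meta tag (between <head> and </head> in the document) to your OTO page and any other pages you don't want indexed. Crawlers have to fetch a page to see the tag, so those pages must not also be blocked in robots.txt.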

    Code:
    <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
    Felu
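To double-check that a page really carries the tag, something like the following Python sketch could be run against the page source. It uses the standard `html.parser` module; the `RobotsMetaParser` class and `robots_directives` helper are hypothetical names for illustration, and the sample HTML is made up apart from the meta tag quoted above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches too.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
    print(robots_directives(page))
```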