How to keep Google from indexing a new blog

5 replies
Okay, that sounds like a strange question, since most people want Googlebot to visit. But I just bought a more advanced WordPress theme than I'm used to, and I need time to work out the kinks before it goes live.

I know there has to be some code you can put in to keep the bot out. Could someone please let me know what it is and which folder I should put it in?

My hosting account has several domains, and I don't want to do anything that might stop Googlebot from visiting those sites. Does some kind, generous-hearted warrior know what I can do?

Thanks so much in advance...
Dolores
#blog #google #indexing
  • Profile picture of the author SteveJohnson
    Make a new file named robots.txt, and in it, put
    Code:
    User-agent: *
    Disallow: /
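    Upload that file to the root folder of the new site (the folder that domain is served from, not the top of your hosting account). That will tell any robot or spider that pays attention that you don't want the site crawled. A robots.txt file only applies to the site it is served from, so your other domains won't be affected.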
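If you'd like to sanity-check those two lines before uploading them, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_blocked` and the example.com URLs are made up for illustration, and Python's parser is only a stand-in for how a well-behaved crawler reads the file, not Google's own implementation.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above: every robot, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if robots_txt asks user_agent not to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "bingbot"):
        for url in ("http://example.com/", "http://example.com/wp-admin/"):
            print(agent, url, is_blocked(ROBOTS_TXT, agent, url))
```

Every combination should come back as blocked. Keep in mind that this only checks what the file asks for; a robot that ignores robots.txt can still fetch the pages.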
    • Profile picture of the author dpepper
      Thank you Steve,

      That would be the one, appreciate the help.

      Have a blessed and prosperous day!
      Dolores
  • Profile picture of the author hadnow
    Yes. In WordPress, go to Admin Panel > Settings > Privacy and select the radio button "I would like to block search engines, but allow normal visitors." This is best done through the wp-admin panel rather than by editing the .htaccess file manually.
  • Profile picture of the author RobKonrad
    Hi Dolores,

    If there's anything on those pages that you don't want to be publicly available for security reasons (it sounds a bit like it), you might need stronger protective measures, as people will still be able to access the content...
  • Profile picture of the author candyo0
    Use a robots.txt (robots exclusion) file
    The robots exclusion standard, as articulated in the robots.txt protocol, says that spiders should look for a plain text file called robots.txt in a site’s top (root) directory. To exclude all robots from crawling directories called sales and images, the following syntax is used:
    User-agent: *
    Disallow: /sales/
    Disallow: /images/

    A common error is to forget the trailing slash; we even spotted this error in a recent Google blog post. For example:
    User-agent: googlebot
    Disallow: /sales

    will stop any URL whose path begins with /sales from being crawled, including pages such as /sales.html or /salesforce/, which is not usually what you want. In this case, we have limited the exclusion to Googlebot. See our article on search engine spiders for a list of the bots associated with each of the major search engines.
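To see the trailing-slash difference for yourself, here is a rough sketch, again using Python's standard-library `urllib.robotparser`, which does the same simple prefix matching described above. The helper name `allowed` and the sample paths are assumptions for the example, and real crawlers may differ in details such as wildcard support.

```python
from urllib.robotparser import RobotFileParser

# Same rule with and without the trailing slash.
WITH_SLASH = "User-agent: googlebot\nDisallow: /sales/\n"
WITHOUT_SLASH = "User-agent: googlebot\nDisallow: /sales\n"


def allowed(robots_txt, user_agent, path):
    """Return True if robots_txt lets user_agent fetch path.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths chosen for illustration.
    for path in ("/sales/index.html", "/sales-report.html", "/about.html"):
        print(
            path,
            "with slash:", allowed(WITH_SLASH, "googlebot", path),
            "without slash:", allowed(WITHOUT_SLASH, "googlebot", path),
        )
```

With the slash, only the /sales/ directory is off limits. Without it, /sales-report.html gets blocked too, because the rule is treated as a plain prefix. Neither version affects a bot other than googlebot, since the rule names that one user agent.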