Correct WP robots.txt File

by alanaj
5 replies
I'm putting a new WP site together and want to make sure I have the right info in the robots.txt file. I've pieced it together from searching WF and Google. I don't know what the two lines with question marks are for, or whether both are needed. Is everything typed correctly and in the right order? I appreciate your help.

Sitemap: http://www.mysite.com/sitemap.xml.gz

# For global/any
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /*?
Disallow: /*?*

Allow: /wp-content/uploads/

#Google Image
User-agent: Googlebot-Image
Disallow:
Allow:/*

#Google Adsense
User-agent: Mediapartners-Google*
Disallow:
Allow:/*

#Google AdsBot
User-agent: Adsbot-Google*
Disallow:
Allow:/*

#digg mirror
User-agent: duggmirror
Disallow:/
  • alanaj
    Any WordPress pros or knowledgeable robot ninjas out there?
  • nettiapina
    I wouldn't bother with that many rules in the usual case, but the paths you're disallowing seem pretty safe to me. The two red lines ("Disallow: /*?" and "Disallow: /*?*") should match every URL with a parameter (i.e. a question mark), which should be fine as long as you're not using ?p=123 style addresses or something custom that puts parameters in your URLs.
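
    To see what the two question-mark lines ("Disallow: /*?" and "Disallow: /*?*") would match, here is a minimal Python sketch. It assumes Google-style matching, where "*" stands for any run of characters, a trailing "$" anchors the end of the URL, and every rule matches as a prefix; other crawlers may treat wildcards differently. The names rule_to_regex and is_blocked and the sample paths are made up for this illustration and are not part of any library.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern into a regex (assumed Google-style rules)."""
    # A trailing "$" means the rule must match the whole path, not just a prefix.
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape every literal piece and join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(rule_path, url_path):
    """Return True if a Disallow rule with this pattern would match the path."""
    return rule_to_regex(rule_path).match(url_path) is not None


# Hypothetical sample paths, chosen only for illustration.
samples = ["/about/", "/?p=123", "/blog/?s=robots", "/wp-admin/options.php"]

for path in samples:
    # Print the verdict of each wildcard rule side by side.
    print(path, is_blocked("/*?", path), is_blocked("/*?*", path))
```

    Under these assumptions the two rules give the same answer for every path, because prefix matching already implies a trailing wildcard. The second line therefore adds nothing to the first.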

    "Allow: /wp-content/uploads/" doesn't seem to do anything, and explicitly allowing Google bots seems redundant. You're not blocking them from anything vital in the part above that.

    Here's a good article:
    Robots.txt Tutorial

    And here's my own WP-generated robots.txt which works perfectly well:
    http://nettiapina.fi/robots.txt

    @Luke, use the force! Err, I mean Google.
    Signature
    Links in signature will not help your SEO. Not on this site, and not on any other forum.
    Who told me this? An ex Google web spam engineer.

    What's your excuse?
  • alanaj
    @Nettiapina - thanks for your input. I read through the info from the link you provided but it didn't really click with my questions about the question marks. The robots.txt file you have on your site comes with the standard WP theme install. That's what's in my theme now, but I want to add more.

    Anyone else using a custom robots.txt file have any feedback on what I have posted above?
    • nettiapina
      Originally posted by alanaj:

      I read through the info from the link you provided but it didn't really click with my questions about the question marks.

      Read it again then, starting from "Robots.txt Wildcard Matching". But I've also provided an explanation in my first reply, so I'm not sure what you're looking for. If it's something more specific, you should really just use Google (with the words in quotation marks above in your search string).

      Why do you want to add more lines in the first place? Google is unlikely to get anything from those extra directories, because their contents are mostly just WordPress-related PHP files. You don't link to them from any HTML page, and requesting them directly shouldn't even produce any interesting results. Messing with something that works doesn't seem like a good idea to me.

      In some cases your site might be exposing your content in a way that you don't want in Google search results. That's a well-grounded reason to modify robots.txt. Search results from site search are a typical example of something your SEO consultant might want to remove from Google.
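
      As a rough illustration of the site-search case, the Python sketch below flags URLs that look like WordPress search results, which by default carry the search term in the "s" query parameter. The function name is_search_result and the sample URLs are hypothetical, and a site with custom search permalinks would need a different check.

```python
from urllib.parse import urlparse, parse_qs


def is_search_result(url):
    """Return True if the URL carries WordPress's default "s" search parameter."""
    query = urlparse(url).query
    # keep_blank_values=True so that an empty search ("?s=") still counts.
    return "s" in parse_qs(query, keep_blank_values=True)


# Hypothetical URLs on the placeholder domain used earlier in the thread.
urls = [
    "http://www.mysite.com/?s=robots",
    "http://www.mysite.com/about/",
    "http://www.mysite.com/?p=123",
]

for url in urls:
    print(url, is_search_result(url))
```

      URLs flagged this way are the kind that a rule such as "Disallow: /?s=" is meant to keep crawlers away from. Check any such rule against your own URL structure before relying on it.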

      If you're looking to increase security then .htaccess is your friend.

      (Nitpick: robots.txt is a core feature. It's got nothing to do with themes.)
