Googlebot unaffected by * in robots.txt?

So if you had the user agent as * and disallow set to /, it applies to all bots BUT Googlebot? As Googlebot needs its own rules?

Seems there's some misinformation out there. Anybody clear it up?
  • HunterSnake
    No offense to you, MillerB7, for I am sure you're a good, decent person, but I've a small rant. Thanks in advance for your understanding.

    <rant>

    Wow... WTF... Are there only like 10 of us on this forum who have actually read Google Webmaster Central's FREELY AVAILABLE information...?

    Block or remove pages using a robots.txt file - Webmaster Tools Help

    Here's some more information, too...

    Robots exclusion standard - Wikipedia, the free encyclopedia

    It would be interesting to know statistics for the number of people "practicing SEO" who don't know how to use the very search engines whose rankings they're trying to "conquer". So many questions that get posted here, short and long alike, can almost always be answered with a simple Google search... Or by reading THE FAQ STICKIED AT THE TOP OF THIS BOARD...

    </rant>
  • millerb7
    I do appreciate the rant, sort of.

    Good thing I'm not really practicing SEO, just "practicing" SEO. I'm always at odds with folks who post things like your comments... I'm sort of under the impression the forum wouldn't even exist if everybody just went to google and searched for their answer?

    Isn't the whole "human interaction" aspect of the forum sort of why we come and ask questions? Just a thought.

    Most of the time, at least with me, I am not looking for a straight yes/no answer. I want some conversation about the matter at hand, to FULLY understand it. Again, I can find this by reading Google search results and all... but then we're back to why have the forum at all? Although I suppose without the forum, some of the information wouldn't even be available on Google at all. Chicken or the egg, I suppose.



    EDIT:

    HAH! Just figured it out, actually, by re-reading. The part that confused me was due to multiple directives in one robots.txt, not a single directive.

    It wasn't that I didn't read; I've read all those links you gave me. It was that The Art of SEO had a part that I "thought" contradicted what else I had read. What was actually going on was that Googlebot had its own line in the robots.txt, so the user agent * did not apply to Googlebot for a disallow /, as it already had a line in there.

    Just a confusion with the text. Besides, who doesn't enjoy having their mind picked?
  • HunterSnake
    No worries, Miller. =) Thanks for understanding. My rant is in part that I have seen and/or answered this exact or very similar question... many times in the last week. It can get a little... me want to strangle... LOL

    The Art of SEO is a great book and I just checked the sections pertaining to Robots.txt and I don't see a contradiction between them and the links that I posted.

    Feel free to AIM/ICQ me if you'd like more personal assistance regarding this issue and I'll explain it to you in a more thorough fashion.
  • millerb7
    Yeah just edited my post, as you were typing apparently.

    It was a big example with 3 different directives in robots.txt, given on page 244. I was assuming they were 3 separate examples, but they were all part of the same robots.txt, so the first directive aimed at Googlebot superseded the directive aimed at "*".
  • HunterSnake
    Yes. Robots.txt is powerful in that you can give generic (*) directions but also apply directions specific to given spiders. This is great when maybe you don't want Google's regular bot to crawl something, but you want the AdSense bot to crawl it so you can display ads.
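
To make the AdSense case concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/members/` path and the `build_parser` helper are made up for illustration; `Mediapartners-Google` is the user-agent token of the AdSense crawler. Python's parser is only an approximation of how Google reads the file, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /members/ is an illustrative path, not from the thread.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /members/

User-agent: Mediapartners-Google
Allow: /members/
"""


def build_parser(text):
    """Parse robots.txt text held in a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


adsense_rules = build_parser(ROBOTS_TXT)

# The regular search crawler is kept out of /members/ ...
print(adsense_rules.can_fetch("Googlebot", "/members/page.html"))
# ... while the AdSense crawler may still fetch it.
print(adsense_rules.can_fetch("Mediapartners-Google", "/members/page.html"))
```

Under this parser the first check comes back `False` and the second `True`.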
  • millerb7
    Okay 1 last question then I'm done bugging you.

    Is it alphanumeric, or do specific commands outweigh general ones?
  • HunterSnake
    That's a good question, and you're not bugging me. Everyone's entitled to a rant once in a while, right? :p

    If User-agent: * disallows /foot/
    but then User-agent: Googlebot allows /foot/, that means that only Googlebot may crawl /foot/ but nothing else may. So * is a global wildcard, but if you add a group for a specific bot, that group overrides whatever * says for that bot. Googlebot follows only its own group and ignores the * group entirely.
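
The /foot/ case can be checked with a short sketch using Python's standard `urllib.robotparser`. `SomeOtherBot` is a made-up crawler name standing in for "everything else". One caveat: Python's parser applies the first matching rule within a group, whereas Google picks the most specific matching path, so the two can disagree on more complicated files than this one.

```python
from urllib.robotparser import RobotFileParser

# The two groups described above, written out as an actual robots.txt.
FOOT_RULES = """\
User-agent: *
Disallow: /foot/

User-agent: Googlebot
Allow: /foot/
"""

foot_parser = RobotFileParser()
foot_parser.parse(FOOT_RULES.splitlines())

# Googlebot is matched by its own group; any other bot falls back to the * group.
for agent in ("Googlebot", "SomeOtherBot"):
    print(agent, foot_parser.can_fetch(agent, "/foot/index.html"))
```

This reports `True` for Googlebot and `False` for SomeOtherBot, so the specific group wins regardless of where it sits in the file.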
    • millerb7
      Got ya.

      And yes, Lord knows I've ranted my fair share.
