Go Back   WarriorForum - Internet Marketing Forums > The Warrior Forum > Adsense / PPC / SEO Discussion Forum
Old 08-25-2010, 12:22 PM   #1
HyperActive Warrior
 
Join Date: Aug 2010
Location: Cincinnati, OH
Posts: 105
Thanks: 0
Thanked 5 Times in 5 Posts
Default Googlebot unaffected by * in robots.txt?

So if you set User-agent: * with Disallow: /, does it apply to all bots BUT Googlebot? Does Googlebot need its own rules?

Seems there's some misinformation out there. Anybody clear it up?
millerb7 is offline   Reply With Quote
Old 08-25-2010, 01:09 PM   #2
HyperActive Warrior
 
HunterSnake's Avatar
 
Join Date: Jun 2010
Posts: 184
Thanks: 16
Thanked 46 Times in 33 Posts
Default Re: Googlebot unaffected by * in robots.txt?

No offense to you, MillerB7, for I am sure you're a good, decent person, but I've a small rant. Thanks in advance for your understanding.

<rant>

Wow... WTF... Are there only like 10 of us on this forum who have actually read Google Webmaster Central's FREELY AVAILABLE information...?

Block or remove pages using a robots.txt file - Webmaster Tools Help

Here's some more information, too...

Robots exclusion standard - Wikipedia, the free encyclopedia

It would be interesting to know statistics for the number of people "practicing SEO" who don't know how to use the very search engines whose rankings they're trying to "conquer". So many questions that get posted here, short and long alike, can almost always be answered with a simple Google search... Or by reading THE FAQ STICKIED AT THE TOP OF THIS BOARD...

</rant>

Hover Coupon | Affordable Domain Names and Personalized, Surname Email for Less
Site5 Rebate Claim Form | Save Big on Reliable Shared and Reseller Web Hosting
HunterSnake is offline   Reply With Quote
Old 08-25-2010, 01:55 PM   #3
Default Re: Googlebot unaffected by * in robots.txt?

I do appreciate the rant, sort of.

Good thing I'm not really practicing SEO, just "practicing" SEO. I'm always at odds with folks who post things like your comments... I'm sort of under the impression the forum wouldn't even exist if everybody just went to Google and searched for their answer?

Isn't the whole "human interaction" aspect of the forum sort of why we come and ask questions? Just a thought.

Most of the time, at least with me, I am not looking for a straight yes/no answer. I want some conversation about the matter at hand, to FULLY understand it. Again, I can find this by reading Google search results and all... but then we're back to why have the forum at all? Although I suppose without the forum, some of the information wouldn't even be available on Google at all. Chicken or the egg, I suppose.



EDIT:

HAH! Just figured it out, actually, by re-reading. The part that confused me involved multiple directives in one robots.txt, not a single directive.

It wasn't that I didn't read; I've read all those links you gave me. It was that The Art of SEO had a part that I "thought" contradicted what else I had read. What was actually going on was that Googlebot had its own line in the robots.txt, so the User-agent: * with Disallow: / did not apply to Googlebot, as it already had a line in there.

Just a confusion with the text. Besides, who doesn't enjoy having their mind picked?
millerb7 is offline   Reply With Quote
Old 08-25-2010, 02:07 PM   #4
Default Re: Googlebot unaffected by * in robots.txt?

No worries, Miller. =) Thanks for understanding. My rant is in part that I have seen and/or answered this exact or very similar question... many times in the last week. It can get a little... me want to strangle... LOL

The Art of SEO is a great book and I just checked the sections pertaining to Robots.txt and I don't see a contradiction between them and the links that I posted.

Feel free to AIM/ICQ me if you'd like more personal assistance regarding this issue and I'll explain it to you in a more thorough fashion.

HunterSnake is offline   Reply With Quote
Old 08-25-2010, 02:13 PM   #5
Default Re: Googlebot unaffected by * in robots.txt?

Yeah, just edited my post as you were typing, apparently.

It was one big example on page 244 with 3 different sections in robots.txt. I assumed they were 3 separate examples, but they all belonged to the same robots.txt, so the first directive, aimed at Googlebot, superseded the directive aimed at "*".
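
For anyone who wants to check this locally, here is a rough sketch using Python's standard urllib.robotparser. It is not Googlebot's own parser, and the two robots.txt files and the URL below are made up for illustration (they are not the example from the book). It only shows that a lone "*" group covers Googlebot too, until Googlebot gets a group of its own.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical file 1: only a wildcard group, so every crawler is blocked.
WILDCARD_ONLY = """\
User-agent: *
Disallow: /
"""

# Hypothetical file 2: same wildcard group plus a group addressed to
# Googlebot. An empty Disallow value means "nothing is disallowed".
WITH_GOOGLEBOT_GROUP = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

wildcard_only = build_parser(WILDCARD_ONLY)
with_group = build_parser(WITH_GOOGLEBOT_GROUP)

URL = "http://example.com/page.html"  # hypothetical URL

print(wildcard_only.can_fetch("Googlebot", URL))  # False: "*" covers Googlebot
print(with_group.can_fetch("Googlebot", URL))     # True: its own group wins
print(with_group.can_fetch("OtherBot", URL))      # False: still under "*"
```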
millerb7 is offline   Reply With Quote
Old 08-25-2010, 02:18 PM   #6
Default Re: Googlebot unaffected by * in robots.txt?

Yes. Robots.txt is powerful in that you can give generic (*) directions but also apply directions specific to given spiders. This is great when maybe you don't want Google's regular bot to crawl something, but you want the AdSense bot to crawl it so you can display ads.
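
A small sketch of that setup, again with Python's standard urllib.robotparser standing in for the real crawlers. Mediapartners-Google is the user agent the AdSense crawler identifies itself with; the /members/ path and the example.com URL are assumptions made up for the demo.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Google's regular crawler out of /members/
# while letting the AdSense crawler read everything.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /members/

User-agent: Mediapartners-Google
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

PAGE = "http://example.com/members/profile.html"  # hypothetical URL

search_ok = parser.can_fetch("Googlebot", PAGE)
adsense_ok = parser.can_fetch("Mediapartners-Google", PAGE)

print(search_ok)   # False: blocked by the Googlebot group
print(adsense_ok)  # True: allowed by its own group
```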

HunterSnake is offline   Reply With Quote
Old 08-25-2010, 02:21 PM   #7
Default Re: Googlebot unaffected by * in robots.txt?

Okay, 1 last question, then I'm done bugging you.

Is precedence alphanumeric, or do specific commands outweigh general ones?
millerb7 is offline   Reply With Quote
Old 08-25-2010, 02:30 PM   #8
Default Re: Googlebot unaffected by * in robots.txt?

That's a good question, and you're not bugging me. Everyone's entitled to a rant once in a while, right?

If User-agent: * disallows /foot/
but then User-agent: Googlebot allows /foot/, that means that only Googlebot may crawl /foot/ but nothing else may. So, * is a global wildcard, but if you give a specific bot its own section, that section overrides whatever * says for that bot. Specific beats general; the order in the file doesn't matter.
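
Here is a rough way to see that for yourself with Python's standard urllib.robotparser. It is a stand-in, not Googlebot's actual parser, and the /private/ rule and the file names are hypothetical additions to the /foot/ example. The sketch builds the same two groups in both orders and also shows that a bot with its own group ignores the "*" group completely, including rules its own group never mentions.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from memory, no network access
    return parser.can_fetch(agent, path)


# The /foot/ rules from the post, plus a hypothetical /private/ rule.
GENERAL = "User-agent: *\nDisallow: /foot/\nDisallow: /private/\n"
SPECIFIC = "User-agent: Googlebot\nAllow: /foot/\n"

general_first = GENERAL + "\n" + SPECIFIC
specific_first = SPECIFIC + "\n" + GENERAL

for label, text in (("general first", general_first),
                    ("specific first", specific_first)):
    print(label,
          allowed(text, "Googlebot", "/foot/toes.html"),     # True
          allowed(text, "OtherBot", "/foot/toes.html"),      # False
          allowed(text, "Googlebot", "/private/notes.html"))  # True
```

The last column is the part that tends to surprise people: the Googlebot group says nothing about /private/, so under this parser Googlebot is not bound by the wildcard's Disallow there. If you want a rule to apply to a bot that has its own group, repeat the rule inside that group.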

HunterSnake is offline   Reply With Quote
Old 08-25-2010, 02:42 PM   #9
Default Re: Googlebot unaffected by * in robots.txt?

Quote:
Originally Posted by HunterSnake
That's a good question, and you're not bugging me. Everyone's entitled to a rant once in a while, right?

If User-agent: * disallows /foot/
but then User-agent: Googlebot allows /foot/, that means that only Googlebot may crawl /foot/ but nothing else may. So, * is a global wildcard, but if you give a specific bot its own section, that section overrides whatever * says for that bot. Specific beats general; the order in the file doesn't matter.

Got ya.

And yes, Lord knows I've ranted my fair share.
millerb7 is offline   Reply With Quote
All times are GMT -6.