#1 |
|
Warrior Member
War Room Member
Join Date: Jul 2008
Location: Bahamas.
Posts: 29
Thanks: 1
Thanked 0 Times in 0 Posts
|
Hey everyone, just a quick question. Should I let unknown bots crawl my site? There are about five different bots with names like "unknown robot" or "bots". Which robots, besides the obvious ones like Google, Yahoo and the others, should I put in my robots.txt file? Any help would be appreciated.
|
#2 |
|
HyperActive Warrior
War Room Member
Join Date: May 2004
Location: Perth, Australia.
Posts: 406
Thanks: 2
Thanked 63 Times in 55 Posts
|
I would strongly suggest against blocking these unknown bots, as they are most likely part of the major search engines (SEs) checking your site.
An increasing number of websites now use cloaking to artificially increase their search engine rankings. The search engines (especially BigG) have now implemented a range of other bots that come from different IP addresses and don't identify themselves as coming from Google at all.
IP cloaking works like this: a bot visits a website, the website determines from its IP address that it's a bot, and so it serves a bunch of keyword-rich text to spider and index. When a human visits the site, the site determines that the visitor is not from a search engine and redirects the human visitor to another website (usually an affiliate page).
So the SEs are now trying to find these pages by sending in bots that look and behave like humans, and others that have no distinguishing details at all. They want to see if the content they see is substantially different from their previous visit. If so, the site may come up for a human review.
So my recommendation is: don't block these bots if you have nothing to hide. Hope this helps.
Bruce
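To make the "substantially different content" idea above concrete, here is a minimal sketch in Python. It is not how any search engine actually does it; the function names, the word-level comparison and the 0.5 threshold are all assumptions chosen for illustration. It simply compares the text served to a bot-like visitor with the text served to a human-like visitor.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Word-level similarity in [0, 1]; 1.0 means the word sequences match."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def looks_cloaked(bot_view: str, human_view: str, threshold: float = 0.5) -> bool:
    """Flag a page when the two views differ a lot (threshold is hypothetical)."""
    return similarity(bot_view, human_view) < threshold


# Made-up page bodies: keyword-stuffed text for the bot, a redirect notice for people.
bot_view = "cheap widgets best widgets buy widgets discount widgets"
human_view = "You are being redirected to our partner offer page"
honest_view = "Welcome to our widget shop. Browse the catalogue below."

cloaked_flag = looks_cloaked(bot_view, human_view)    # the views share no words
honest_flag = looks_cloaked(honest_view, honest_view)  # identical views
```

A site that serves the same page to everyone scores 1.0 and is never flagged, which is the "nothing to hide" case described above.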
|
#3 |
|
Warrior Member
War Room Member
Join Date: Jul 2008
Location: Bahamas.
Posts: 29
Thanks: 1
Thanked 0 Times in 0 Posts
|
Thanks for the help, Bruce.
|
#4 |
|
HyperActive Warrior
Join Date: May 2008
Location: USA
Posts: 228
Blog Entries: 22
Thanks: 8
Thanked 21 Times in 20 Posts
|
What Bruce mentioned is valid, but I just wanted to offer another point of view. In the beginning I didn't care who came by my websites, and I was happy to have the visitors. True, a lot of the automated robots (or bots) were related to the search engines, but in the last few years and months I started to see an increasing number of unknown sources. These weren't random; they were hitting the server relentlessly.
So I took some advice from a web development company that was fed up with these suspicious connections, and decided to implement a long list for robots.txt, added some rules to .htaccess, and started monitoring everything with a web application firewall called mod_security. Why? Because of the following benefits, which I'm sure you've seen on other websites like botsense.com,
Sorry to sound negative, but with some of the security issues I've dealt with, it becomes an advantage to take a defensive position to protect myself and my client's assets.
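Before building a long block list, it helps to see which agents are actually hitting the server relentlessly. The sketch below is only an illustration and is not the setup described in this post: it assumes an Apache-style combined access log where the user agent is the last double-quoted field, and the sample lines, function names and limit are made up.

```python
import re
from collections import Counter

# Assumption: combined log format, where the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Made-up sample lines using documentation IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2009:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Googlebot/2.1"',
    '198.51.100.7 - - [10/Oct/2009:13:55:37 +0000] "GET /a HTTP/1.1" 200 512 "-" "unknown robot"',
    '198.51.100.7 - - [10/Oct/2009:13:55:37 +0000] "GET /b HTTP/1.1" 200 512 "-" "unknown robot"',
    '198.51.100.7 - - [10/Oct/2009:13:55:38 +0000] "GET /c HTTP/1.1" 200 512 "-" "unknown robot"',
]


def hits_per_agent(log_lines):
    """Count requests per user-agent string."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        agent = match.group(1) if match else "(unparsed)"
        counts[agent or "(empty)"] += 1
    return counts


def heavy_hitters(counts, limit):
    """Return agents with more than `limit` requests, sorted by name."""
    return sorted(agent for agent, n in counts.items() if n > limit)


counts = hits_per_agent(SAMPLE_LOG)
suspects = heavy_hitters(counts, limit=2)
```

The agents that show up in `suspects` are candidates for a closer look, and only then for a robots.txt entry or an .htaccess rule.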
|
#5 |
|
Active Warrior
Join Date: Jan 2009
Posts: 33
Thanks: 1
Thanked 2 Times in 1 Post
|
Just take a look at some recent trends:
Google 50%
Yahoo 23%
MSN 10%
AOL 6%
ASK 2%
Others 9%
So as you can see, the other 9% do not make a big difference, with the exception of Alexa. So it is a very good idea to allow the "well-known" spiders and block the rest.
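The "allow the well-known spiders and block the rest" approach could look roughly like the sketch below. It builds a robots.txt and checks it with Python's standard `urllib.robotparser`. The agent names in `ALLOWED_AGENTS` are an illustrative assumption (verify the current tokens in each engine's own documentation), and `build_robots_txt` is a hypothetical helper, not something from this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumption: illustrative user-agent tokens for Google, Yahoo, MSN and Alexa.
ALLOWED_AGENTS = ["Googlebot", "Slurp", "msnbot", "ia_archiver"]


def build_robots_txt(allowed):
    """One allow-everything block per named agent, then a catch-all block."""
    blocks = [f"User-agent: {name}\nDisallow:\n" for name in allowed]
    blocks.append("User-agent: *\nDisallow: /\n")  # everyone else is asked to stay out
    return "\n".join(blocks)


robots_txt = build_robots_txt(ALLOWED_AGENTS)

# Parse the generated rules and ask what each kind of visitor is allowed to fetch.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

google_ok = parser.can_fetch("Googlebot", "/page.html")
unknown_ok = parser.can_fetch("unknown robot", "/page.html")
```

Keep in mind that robots.txt is only a request. Well-behaved crawlers honour it, but a bot that ignores the file has to be stopped at the server level, for example with .htaccess rules.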
| Tags |
| bots, unknown |