| | #1 |
| HyperActive Warrior War Room Member Join Date: Mar 2009
Posts: 244
Thanks: 6
Thanked 86 Times in 43 Posts
|
Fellow bloggers, I've been having tremendous trouble keeping my "thin" sites up on a very robust dedicated server at Servint. Too many wp-cron.php runs, MySQL queries, and Googlebot crawls have overloaded my server's RAM and CPU at least 5-6 times every day for many weeks now, and my EPN income has basically dropped to toast as a result of the traffic loss. I solved the wp-cron.php problem by disabling it and then re-enabling it through staggered cron jobs, but today my loads were worse than ever, and this is what I heard from Servint. According to Servint, a number of other people with "thin" sites and I are getting slammed by Google's image bot, which is now recrawling every time a new image loads, effectively stopping traffic for anyone with autoblog sites that have lots of live image reloads. It appears to be a deliberate tactic by Google to bring down thin-style and Mage-style autoblogs. Servint is suggesting I put a hard block on Google's image bot, hoping that it will not also block SE traffic to the long-tail eBay keywords associated with each auction. Any advice is welcome, thanks. |
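For anyone wanting to try the disable-then-stagger approach described above, here is a rough sketch of how the crontab lines could be generated. It assumes each site already has `define('DISABLE_WP_CRON', true);` in its wp-config.php, that `curl` is installed, and that the domains are placeholders (the `.example` names are made up for illustration). Each site gets a different minute offset so the wp-cron.php runs do not all fire at once.

```python
# Hypothetical site list; replace with your own domains.
SITES = ["site-a.example", "site-b.example", "site-c.example"]


def staggered_crontab(sites, interval_minutes=15):
    """Return one crontab line per site, each offset by a different minute.

    Assumes WordPress's built-in trigger is disabled via DISABLE_WP_CRON,
    so these lines are the only thing calling wp-cron.php.
    """
    if interval_minutes <= 0 or 60 % interval_minutes != 0:
        raise ValueError("interval_minutes must divide 60 evenly")
    lines = []
    for index, site in enumerate(sites):
        # Wrap the offset so it always stays inside one interval.
        offset = index % interval_minutes
        minutes = ",".join(str(m) for m in range(offset, 60, interval_minutes))
        url = f"http://{site}/wp-cron.php?doing_wp_cron"
        lines.append(f"{minutes} * * * * curl -s -o /dev/null '{url}'")
    return lines


if __name__ == "__main__":
    print("\n".join(staggered_crontab(SITES)))
```

With more sites than minutes in the interval, the offsets wrap around, so some sites will share a minute; a longer interval spreads them further apart.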
| | |
| | #2 |
| Chasing Freedom in 2011 War Room Member Join Date: Apr 2007 Location: Denton, TX, USA
Posts: 1,356
Thanks: 124
Thanked 279 Times in 180 Posts
|
This is exactly why it is so important to have unique content on your blogs. Benjamin Ehinger |
| | |
| | |
| | #3 |
| Suzanne War Room Member Join Date: Jan 2007 Location: Virginia, USA.
Posts: 11,714
Blog Entries: 1 Thanks: 1,468
Thanked 4,739 Times in 2,624 Posts
|
Disallow the image bot from crawling by putting this in your robots.txt file:
User-agent: Googlebot-Image
Disallow: / |
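On the worry about whether this rule also shuts out the regular Googlebot: a quick way to sanity-check a robots.txt is Python's standard `urllib.robotparser`. This is only a sketch; Python's parser is not Google's, so it shows how the rules read under one standard parser rather than what Google will actually do. The example URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the post above.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot-Image", "Googlebot"):
        verdict = allowed(ROBOTS_TXT, agent, "http://site-a.example/page.html")
        print(agent, "allowed" if verdict else "blocked")
```

Under this parser the image bot is blocked and the regular Googlebot is not, because the file has no group that applies to it.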
| | |
| | |
| | #4 | |
| HyperActive Warrior War Room Member Join Date: Mar 2009
Posts: 244
Thanks: 6
Thanked 86 Times in 43 Posts
| Quote:
However, I don't know if blocking the image files will also block long-tail KW search traffic. And Bill Platt, above, sounds pretty certain that blocking any part of their bots will keep all Google traffic away forever. If he's right about that, this will not work, because I do depend on Google traffic. If anybody knows the real truth about that assertion, I'd appreciate it if you could weigh in. | |
| | |
| | #5 | |
| Suzanne War Room Member Join Date: Jan 2007 Location: Virginia, USA.
Posts: 11,714
Blog Entries: 1 Thanks: 1,468
Thanked 4,739 Times in 2,624 Posts
| Quote:
| |
| | ||
| | |
| | #6 |
| Senior Warrior Member Join Date: Apr 2007 Location: US of A
Posts: 2,190
Thanks: 47
Thanked 258 Times in 212 Posts
|
Here is a little secret about Google. Google will say one thing, then do whatever they want. That is a fact. People who think Google is an ethical company may want to read this: "Google Antitrust Hearing Witness List" (WebProNews) |
| | |
| | |
| | #7 | |
| Suzanne War Room Member Join Date: Jan 2007 Location: Virginia, USA.
Posts: 11,714
Blog Entries: 1 Thanks: 1,468
Thanked 4,739 Times in 2,624 Posts
| Quote:
| |
| | ||
| | |
| | #8 | |
| HyperActive Warrior Join Date: Jun 2011 Location: UK
Posts: 429
Thanks: 8
Thanked 61 Times in 56 Posts
| Quote:
| |
| | |
| | #9 | |
| HyperActive Warrior War Room Member Join Date: Mar 2009
Posts: 244
Thanks: 6
Thanked 86 Times in 43 Posts
| Quote:
The only images on my sites are eBay auction images generated by phpBay calls. Every page has lots of pictured auctions on it. I could choose an option that shows only the text, but that would probably stop click-throughs and defeat the purpose. Servint is still working on this. Even after installing the Google image bot block, loads are still exceedingly high, so Google appears to be ignoring the block. Or they're checking to see if some other bot or script is now overloading it. Freaking nightmare! | |
| | |
| | #10 | |
| The Nature Lady War Room Member Join Date: Nov 2004 Location: , , USA.
Posts: 4,169
Thanks: 3,294
Thanked 3,724 Times in 2,060 Posts
| Quote:
Whatever is going on - I sure hope it stops soon. | |
|
Sal | ||
| | |
| | #11 | |
| HyperActive Warrior War Room Member Join Date: Mar 2009
Posts: 244
Thanks: 6
Thanked 86 Times in 43 Posts
| Quote:
Well, I feel a little bit less concerned now. Not first-hand experience or a direct quote from Matt Cutts, but "I know a fella..." If I had a dollar for every time someone said they are sure Google works this way or that way, because they heard it from someone so it must be true, I'd have enough money to forget about Google altogether. No one on this forum, so far as I know, has been able to suss out Google's inner workings and decisions. I also know a fella, many fellas and ladies, who put a temporary Googlebot block on their sites when they were getting hit too hard, then later removed it, and their traffic was still incoming. I'm sure if you leave the block on forever, you won't get any traffic from Google, but I doubt a temporary one screws you with Google for life. But I'll update soon when I see if my dilemma gets fixed. So far, Servint hasn't been able to fix it...and I was told (by a fella) that they were the best & most experienced hosting company. | |
| | |
| | #12 | |
| Plundering the Web War Room Member Join Date: Feb 2007 Location: , , .
Posts: 4,851
Thanks: 804
Thanked 1,200 Times in 887 Posts
| Quote:
A secret to those who don't read or watch the news. Google will answer this with the same answer that all the other bogus committees get: using Google is a choice. Paul | |
| How to Make Money off Facebook: Login to your account. Deactivate your account. Get your butt to work.
| ||
| | |
| | #13 |
| SEO Strategist War Room Member Join Date: Jun 2010
Posts: 6,533
Thanks: 355
Thanked 1,993 Times in 1,274 Posts
|
The real question is what you are running cron jobs for (email?). I don't get why you're running a cron job on a thin site. Is it a new auto-blog? Most times cron jobs are the reason for servers bogging down. I really doubt it's G bots bogging down your host. If it is, that's a host problem. |
| | |
| | |
| | #14 |
| SEO Strategist War Room Member Join Date: Jun 2010
Posts: 6,533
Thanks: 355
Thanked 1,993 Times in 1,274 Posts
| |
| | |
| | |
| | #15 |
| HyperActive Warrior Join Date: Nov 2009 Location: Thailand
Posts: 147
Thanks: 11
Thanked 24 Times in 19 Posts
|
I have 2 very large sites: one has over 12000 pictures on it, and the other has over 8000. In the case of the site with 8000 pictures, I decided it probably wasn't a good idea to let the Google image bot crawl it, for various reasons, so I disallowed it in robots.txt. I did see a big drop in traffic to that site, but it was all Google image traffic that never converted anyway, as people were just coming to the site to steal pics. I would normally never disallow the Google image bot, as it can bring in people who stick around, as it does for my site with 12000 pics. |
| | |
| | #16 | |
| SEO Strategist War Room Member Join Date: Jun 2010
Posts: 6,533
Thanks: 355
Thanked 1,993 Times in 1,274 Posts
|
Still, what's the cron job for? That could very well be the root of the problem. If it's actually G bot killing your server (it still sounds like cron jobs are the issue), you can always throttle all G bots in your Google Webmaster Tools Admin (Site configuration > Settings > Crawl rate). I think blocking the image bot is a bad idea. I have a couple of sites that also get a lot of traffic from Google Images, and I would hate to lose that traffic to my Adsense sites by blocking Google Images. [Unrelated to the OP problem] One trick I use on my image sites is to break the Google Images frame, which otherwise lets visitors grab the image without visiting my site. That's helped control direct image links from Google Images. My image traffic doesn't have a choice: they have to land on my page when they click the Google Images thumbnail. Quote:
| |
| | ||
| | |
| | #17 | |
| HyperActive Warrior War Room Member Join Date: Mar 2009
Posts: 244
Thanks: 6
Thanked 86 Times in 43 Posts
| Quote:
So your certainty that it's still a cron problem is probably off-base. And when you say that if it really is Gbot bogging down my server, it's a host problem, offer me a host solution. As mentioned above, I picked Servint (after a long time with shared accounts on Hostgator) because practically everyone in the IM world speaks glowingly of Servint's reputation & reliability. I have yet to hear any poster counter that and say they suck. But if you really feel I am being ill-served by Servint, please recommend a superior choice...I have no loyalty and will switch in an instant if you know of one that will do the job "right." | |
| | |
| | #18 |
| SEO Strategist War Room Member Join Date: Jun 2010
Posts: 6,533
Thanks: 355
Thanked 1,993 Times in 1,274 Posts
|
OK, I thought when you mentioned the cron job in the OP that you were doing something on your own. How about WP plugins? Have you tried turning plugins on and off while watching the server logs? I think a few plugins exist that might let you monitor the wp-cron.php jobs, which might be worthwhile. |
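Watching the server logs is easier with a quick tally of who is actually making the requests. The sketch below assumes an Apache access log in the usual "combined" format (user agent as the last quoted field); the sample lines, IP addresses, and domains are made up for illustration. It counts requests per user agent and separately counts hits on wp-cron.php, which helps tell a crawler problem apart from a plugin or cron problem.

```python
import re
from collections import Counter

# Combined Log Format tail: "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize(lines):
    """Return (requests per user agent, number of wp-cron.php hits)."""
    agents = Counter()
    wp_cron_hits = 0
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        agents[match.group("agent")] += 1
        if "/wp-cron.php" in match.group("request"):
            wp_cron_hits += 1
    return agents, wp_cron_hits


# Made-up sample lines using reserved documentation IP addresses.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2011:13:55:36 -0700] "GET /img/a.jpg HTTP/1.1" 200 2326 "-" "Googlebot-Image/1.0"',
    '192.0.2.1 - - [10/Oct/2011:13:55:37 -0700] "GET /img/b.jpg HTTP/1.1" 200 1810 "-" "Googlebot-Image/1.0"',
    '192.0.2.9 - - [10/Oct/2011:13:55:38 -0700] "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 - "-" "WordPress/3.2.1; http://site-a.example"',
]

if __name__ == "__main__":
    agents, cron_hits = summarize(SAMPLE)
    for agent, count in agents.most_common(10):
        print(count, agent)
    print("wp-cron.php hits:", cron_hits)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `summarize(open("access.log"))`, where the path is whatever your host uses.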
| | |
| | |
| | #19 |
| Warrior Member Join Date: Sep 2011
Posts: 22
Thanks: 0
Thanked 0 Times in 0 Posts
|
Hopefully disallowing the Google image bot solves the problem, but at the same time be ready to lose some of the visitors who arrive from Google Image search.
|
| | |
| | #20 |
| HyperActive Warrior Join Date: Dec 2010
Posts: 475
Thanks: 42
Thanked 68 Times in 47 Posts
|
Maybe kill the wp-cron job? See "How to stop wp-cron.php from firing!" (Mellowhost). I stay away from WordPress so I don't know much about this, but it sure looks like the job sucks a lot of CPU all on its own, and often. |
| | |
| | #21 |
| Warrior Member Join Date: Apr 2009 Location: Jacksonville, FL
Posts: 18
Thanks: 1
Thanked 4 Times in 4 Posts
|
If the Google image bot still crawls your site, you can ban it outright from accessing your website with .htaccess.
|
|
| |
| | |
| | #22 | |
| SEO D'Artagnan War Room Member Join Date: Aug 2009
Posts: 4,980
Thanks: 476
Thanked 1,090 Times in 701 Posts
| Quote:
The more likely situation is that Google has some other reason to keep coming back, such as links from, or the popularity of, some of the pages linking to you. Being crawled more often is usually a good sign. Punishing you with image bots but not deindexing you makes no sense. | |
| | ||
| | |
| | #23 |
| The SEO Wonder Kid War Room Member Join Date: Aug 2011 Location: Secret Lab
Posts: 263
Thanks: 45
Thanked 38 Times in 29 Posts
|
No, this is not Google bots' doing for sure. It's either your host or the very same wp-cron plugin's fault. You really don't need such a plugin.
|
| | |
| | #24 |
| HyperActive Warrior War Room Member Join Date: Jun 2007 Location: Tooele, UT, USA.
Posts: 220
Thanks: 0
Thanked 28 Times in 15 Posts
|
Did anyone ever find a solution for this problem? Did blocking the Imagebot fix it? I've got two servers at Servint and one of them has been continuously crushed by Googlebots for the past 36 hours. Tech support is baffled, and seems not to have any memory of this. Charlie PS Strange thing - I have two servers with Servint, but only one of them has this problem. Updating the robots file to exclude the image bot has not helped. |
| | |
| | |
| | #25 |
| HyperActive Warrior War Room Member Join Date: Jun 2007 Location: Tooele, UT, USA.
Posts: 220
Thanks: 0
Thanked 28 Times in 15 Posts
|
Problem solved: the cause was a WordPress plugin called WP Linknet that I bought here on the WF. A catastrophe. Problem fixed. Charlie |
| | |
| | |
| | #27 |
| Warrior Member Join Date: Feb 2012
Posts: 1
Thanks: 0
Thanked 0 Times in 0 Posts
|
This is an interesting and real topic aside from the plugin; I notice it a lot. In any event, most people are not looking for traffic from Yandex or Baidu, so blocking those subnet blocks with CIDR notation is a good way to reduce traffic. Don't build a hefty .htaccess or anything crazy, though, because the preload on Apache for those directives can increase the read I/O overhead. For Google you really need to get a Google Webmasters account for each of your domains and configure a low, low, low crawl rate; you can do this for Bing too, I think. This goes especially for you guys who like to put 100 domains on a single shared-account base package with no caching installed on any of them. And set up a robots.txt file for the crawlers that respect those directives. And of course, install caching. W3TC is nice; just use the HTML rewrite caching, since keeping everything loaded through Apache gives the fastest and most reliable speeds (faster speeds help SEO). wp-cron's transient functions + traffic + hooks + bad plugin = ouch. |
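As a rough illustration of what "blocking subnet blocks with CIDR notation" means in practice, the sketch below uses Python's standard `ipaddress` module to test whether a visitor IP falls inside a list of ranges. The ranges shown are reserved documentation blocks used as placeholders, not real Yandex or Baidu ranges; you would need to look those up yourself.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real crawler ranges.
BLOCKED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if ip falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)


if __name__ == "__main__":
    for ip in ("192.0.2.77", "203.0.113.5"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

The same check is handy for running over the IPs in an access log before committing a range to a firewall or server config, so you can see how much traffic a block would actually remove.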
| | |
| Tags |
| attacking, bots, google, sites, thin |