Question about Scrapebox

18 replies
I'm a newbie with Scrapebox and this might sound stupid, but I don't understand how to scrape a big list of URLs at once.

I mean, if I use, for example, a lot of Drupal footprints: the harvest takes under 1 minute, it says the harvester is completed, and there are only under 300 URLs (no duplicate domains) in the list. If I want to harvest all the footprints, I have to export the not-completed keywords back to the keyword list and start harvesting again, repeating the same process over and over.

I don't understand how I could scrape all those keywords (footprints) at once with a single click of the start-harvesting button.

I use 30 semi-private proxies and 100 connections for harvesting; the other settings are at their defaults, I guess.

Thank you
#search engine optimization #question #scrapebox
  • Create a file with all your footprints. Save it as a .txt file.

    Put all your keywords in the keyword box.

    Then hit the 'M' button at the top. Load your footprint file. It merges the keywords with the footprints.

    Now scrape.
    • [ 1 ] Thanks
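The merge that the 'M' button performs is just a cross product: every footprint paired with every keyword, one search query per pair. A minimal Python sketch of that idea (the footprints and keywords here are illustrative, not from the thread):

```python
# Rough sketch of ScrapeBox's footprint/keyword merge: pair every
# footprint with every keyword to build one search query per combination.

def merge(footprints, keywords):
    """Return one search query per footprint/keyword pair."""
    return [f"{footprint} {keyword}"
            for footprint in footprints
            for keyword in keywords]

# Example Drupal footprints and keywords (illustrative only).
footprints = ['"Powered by Drupal"', 'inurl:node']
keywords = ["gardening", "fishing"]

for query in merge(footprints, keywords):
    print(query)
```

With 10 footprints and 1,000 keywords this produces 10,000 queries, which is why a full merged harvest takes far longer than a single-footprint run.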
  • I am really curious why you people still use ScrapeBox. It could harm your website; those spammy comments won't help your site anymore (it was a good strategy maybe a few years ago...)
    • [1] reply
    • Nobody said a word about leaving spammy comments. Scrapebox does so much more than that. If that is all you think it is useful for, you have been missing out.
      • [ 1 ] Thanks
      • [1] reply
  • That sounds like awful proxies. Try it without them once.
    • [2] replies
    • No, they are buyproxies.org's proxies... The problem is not the proxies.

      I think I can't solve this issue.
    • Hey Mike, what else are you using scrapebox for?
  • The proxies could still be dead to Google. You need to troubleshoot this, so try it without proxies to see whether the problem is the proxies or your settings. If it works well without the proxies, you know the proxies are the problem; if not, you know it's a problem with your settings.
    • [1] reply
    • No, they are not dead to Google, I can test them easily in GSA SER and they work fine.

      The problem is in settings but I don't know where.

      I live in Finland (you have possibly seen some grammar errors in my text); I think I have to try some other scraper...
  • Fine. If you don't want to try that simple test and just want to argue, then I can tell you are going to be a pain in the ass to help. I'm not interested anymore. Good luck.
    • [1] reply
    • OK, you were right. I tested the scrape without any proxies and got about 2k URLs (duplicate domains removed). Every query (with every keyword) was completed.

      But I don't want to do these things without proxies in the future.
      • [2] replies
  • Wow, thank you guys, I learned a lot from this thread.
  • Make the thread count twice your proxy number and set the delay per search query to 60-80 seconds; your IPs will last a lot longer than you think, especially on Google.
    • [1] reply
    • 60-80 second delay

      No way. If you are doing a big scrape, that could take 4-5 days.
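For scale, here is the arithmetic behind that objection; the query count is an assumed example, not a number from the thread:

```python
# Back-of-the-envelope: a 60-80 s delay per query, run one query at a
# time, turns a large merged keyword list into a multi-day scrape.
queries = 5000            # assumed size of a merged footprint/keyword list
delay_s = 70              # midpoint of the suggested 60-80 s delay

days = queries * delay_s / 86400  # 86400 seconds in a day
print(f"~{days:.1f} days")        # ~4.1 days
```

Spreading the queries across the 30 proxies mentioned earlier divides that time accordingly, which is the usual trade-off between per-IP delay and proxy count.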
  • An alternative is to use a VPN that has a lot of different servers to choose from. This is what I do, since it's super simple to switch servers once Google gets cranky.
