5 replies
Not sure if this is the correct sub-forum to post this. Sorry if it isn't.

I would like to scrape tens of thousands of Amazon product listings, pictures, text, all.

Is there off-the-shelf software I can use to do this? I saw something about ScrapeBox on another thread.

If not, I can code so that's no problem.

What kind of an infrastructure do I need?

How many proxies?

What kind?

Where?

I ran a little test with code I wrote and it looks like my IP address got put "on hold" for an hour or so. Is this the ISP doing it or Amazon? So, if I run it off a server somewhere out there rather than my own computer at home, can I hit Amazon reasonably hard (every 15 to 30 seconds) and get away with it or do I need dozens of proxies to be able to pull, say, 10,000 product listings?
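
For a rough sense of scale, here is a back-of-the-envelope sketch of the numbers above: 10,000 listings at one request every 15 to 30 seconds. The function names, the 8-hour target, and the one-request-per-listing simplification are assumptions for illustration (pictures would add requests), and nothing here says what rate Amazon actually tolerates.

```python
import math


def crawl_hours(listings: int, seconds_per_request: float, workers: int = 1) -> float:
    """Wall-clock hours to fetch `listings` pages at one request per page.

    Assumes each worker (for example, one IP or proxy) sends one request
    every `seconds_per_request` seconds and the work splits evenly.
    """
    return listings * seconds_per_request / workers / 3600


def workers_needed(listings: int, seconds_per_request: float, target_hours: float) -> int:
    """Smallest number of parallel workers that finishes within `target_hours`."""
    return math.ceil(listings * seconds_per_request / (target_hours * 3600))


if __name__ == "__main__":
    LISTINGS = 10_000      # figure from the question
    TARGET_HOURS = 8       # hypothetical deadline, not from the question
    for delay in (15, 30):  # seconds between requests, from the question
        single = crawl_hours(LISTINGS, delay)
        needed = workers_needed(LISTINGS, delay, TARGET_HOURS)
        print(f"{delay}s per request: {single:.1f} h on one IP, "
              f"{needed} workers to finish in {TARGET_HOURS} h")
```

So on a single IP the job takes somewhere between roughly 42 and 83 hours at that pace, which is why the proxy count matters if it has to finish sooner.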

Thanks!
#amazon #scrapping
  • Profile picture of the author Adam Roy
    Being familiar with code, why not just use their API and have them send that information to you, rather than scraping all their pages?
    • Profile picture of the author john5Jhx
      Originally Posted by Adam Roy

      Being familiar with code, why not just use their API and have them send that information to you, rather than scraping all their pages?
      Not everything on a listing is available through the API. If it were, you would be right: I would not need scraping.

      Thanks.
  • Profile picture of the author shahocean
    You can use Import.io or Kimono Labs to get all the data. If you want a step-by-step guide: Here you go

    Good Luck!
    Signature

    Sagar Shah
    www.SagarShah.co

    • Profile picture of the author john5Jhx
      Originally Posted by shahocean

      You can use Import.io or Kimono Labs to get all the data. If you want a step-by-step guide: Here you go

      Good Luck!
      Thanks. Import.io looks interesting.

      I would still like to understand what I'd have to do if I wanted to roll out my own infrastructure to do this with fully custom code. I am starting to experiment with various ideas right now. We'll see what happens.
      • Profile picture of the author shahocean
        Originally Posted by john5Jhx

        Thanks. Import.io looks interesting.

        I would still like to understand what I'd have to do if I wanted to roll out my own infrastructure to do this with fully custom code. I am starting to experiment with various ideas right now. We'll see what happens.
        I am sure you will love it. One thing I learned: check the data carefully. You will get many insights that you can't buy anywhere. (Personal experience)
        Signature

        Sagar Shah
        www.SagarShah.co

  • Profile picture of the author astri
    [DELETED]
