Scraping/crawling data

15 replies
Not sure if this is the correct board to post this on, but I would like to discuss algorithmic data gathering here.

Specifically, I have been more and more involved in scraping data for various clients and I've gotten quite good at it.

Recently I acquired about 1.5GB of UK property sales data for a client, and it got me thinking:

- Is there an organized market where one could sell such data?
- What is the legal status of scraping? Some resources say it's legal because you're just gathering information that's already publicly posted; others say it's a grey area or depends on the specific website's terms of service.
- Has anyone made good money through scraping and would like to share their experience?
#data #scraping or crawling
  • TripLoop
    What did you use to scrape data?
    • Monsignor
      Python's Scrapy library.
      Signature
      The intelligent investor is a realist who sells to optimists and buys from pessimists. - B. Graham
      • minimax
        Can scraping be done with a Chrome extension?
    • CyberSEO
      Originally posted by TripLoop:

      What did you use to scrape data?

      Usually it's PHP. There is nothing difficult about content scraping, even for people with only a basic knowledge of regular expressions.
      Signature
      CyberSEO Pro - the ultimate all-in-one autoblogging WordPress plugin, powered by OpenAI GPT-4, Anthropic Claude, Google Gemini Pro, Midjourney, DALL-E 3 and Stable Diffusion XL
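To make the regular-expression approach concrete, here is a minimal sketch. It is written in Python rather than PHP, to match the other replies in the thread. The HTML snippet, the `price` class name and the `extract_listings` helper are all made up for illustration; a real page will need its own pattern, and regex-based extraction tends to break when the markup changes.

```python
import re

# Hypothetical HTML; in practice this would be the body of an HTTP response.
SAMPLE_HTML = """
<ul>
  <li><a href="/property/101">12 Elm Road</a> <span class="price">250,000</span></li>
  <li><a href="/property/102">7 Oak Lane</a> <span class="price">415,500</span></li>
</ul>
"""

# One pattern per listing: link target, link text, then the price span.
LISTING_RE = re.compile(
    r'<a href="(?P<url>[^"]+)">(?P<address>[^<]+)</a>\s*'
    r'<span class="price">(?P<price>[\d,]+)</span>'
)


def extract_listings(html):
    """Return one dict per listing found in the HTML string."""
    listings = []
    for match in LISTING_RE.finditer(html):
        listings.append({
            "url": match.group("url"),
            "address": match.group("address").strip(),
            # Drop thousands separators so the price is a plain integer.
            "price": int(match.group("price").replace(",", "")),
        })
    return listings


if __name__ == "__main__":
    for row in extract_listings(SAMPLE_HTML):
        print(row)
```

This works when the markup is regular and under your control; for anything messier, an HTML parser is usually the safer choice.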
    • Johnny12345
      Originally posted by TripLoop:

      What did you use to scrape data?

      You can use a Python script to scrape the web. Al Sweigart has a chapter about it in his book, Automate the Boring Stuff with Python. He also has a YouTube channel.

      You can read that chapter online for free on his website:

      https://automatetheboringstuff.com

      He has a ton of free Python training.

      John
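For a rough idea of what such a Python script looks like, the sketch below pulls every link out of a page using only the standard library's `html.parser`. It is not taken from the book; the sample page and the `LinkCollector` and `collect_links` names are assumptions made for illustration, and the page is a hard-coded string so the example runs without network access.

```python
from html.parser import HTMLParser

# Hypothetical page; a real script would download this text first.
SAMPLE_PAGE = """
<html><body>
  <h1>Recent sales</h1>
  <a href="/sales/2023">Sales in <b>2023</b></a>
  <a href="/sales/2024">Sales in 2024</a>
  <a name="top">No link here</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from <a> tags that have an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs.
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for href, text in collect_links(SAMPLE_PAGE):
        print(href, "->", text)
```

Third-party libraries make this shorter, but the structure is the same: fetch the page, parse it, keep the pieces you care about.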
  • bartology
    What you're talking about is lead generation. The value of the data lies in whether you can analyze it into a dataset that converts to leads for a business that can use it. If you can do that, you can make money.
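As a small illustration of turning raw scraped records into a narrower set a buyer might care about, the sketch below filters property sales by price and sale date. The column names, the sample rows and the `select_leads` helper are invented for the example; what counts as a useful lead depends entirely on the business buying it.

```python
import csv
import io
from datetime import date

# Hypothetical property sales records; the columns are assumptions.
SAMPLE_CSV = """address,postcode,price,sale_date
12 Elm Road,M1 2AB,250000,2023-11-02
7 Oak Lane,LS6 3CD,415500,2021-04-17
3 Ash Close,M4 5EF,180000,2024-01-20
"""


def select_leads(csv_text, max_price, sold_since):
    """Keep rows sold on or after `sold_since` at or below `max_price`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    leads = []
    for row in reader:
        price = int(row["price"])
        sold = date.fromisoformat(row["sale_date"])
        if price <= max_price and sold >= sold_since:
            leads.append(row)
    return leads


if __name__ == "__main__":
    for lead in select_leads(SAMPLE_CSV, 300000, date(2023, 1, 1)):
        print(lead["address"], lead["postcode"], lead["price"])
```

The filter itself is trivial; the work is in choosing criteria that match what a specific buyer actually needs.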
    • Monsignor
      Thanks. But how would I go about selling the leads? Is there an organized marketplace for this sort of thing, or do I just find and contact businesses directly?
      • rocky80
        If you're selling real estate leads, there is a huge marketplace for them. I'm not sure where you are, but look into Fiverr, or even post an ad on Craigslist, a local forum, or a meetup with investors. Investors will line up if you have good data.
  • ryanbiddulph
    The only glaring issue with this approach is that it lacks a warm, personal, genuine, human touch. Stick with building bonds to build a thriving business the right way. I trashed a few more emails today from bloggers who scraped my data and email address and sent me unsolicited emails. It's a silly approach, because you chase out of fear while I attract out of love. Peep my Seen At page to see how it has worked for me, hehe.
    Signature
    Ryan Biddulph helps you to be a successful blogger with his courses, manuals and blog at Blogging From Paradise
    • Monsignor
      Dude, I'm a programmer, not a blogger or a marketer. I am looking to market my skills, not change professions.
  • MSutton
    There definitely is a market for scraped data and scraping services. Is it legal or illegal? Like you said, it really depends. It's widely thought of as "gray hat" or "black hat" by those who know nothing about it, while those who profit from it don't worry about it much, lol.

    There are a lot of uses for scraped data. I suggest you search Google and visit blackhat forums (no, not all information on blackhat forums is shady). The more research you do on scraping, the better you will understand how to use it for profit, whether by using the scraped data yourself, selling it, or performing scraping services for other businesses. We live in the information age and data is valuable.

    Businesses scrape their competitors' websites for data they can use to get a better picture of their competitors and to compete better against them. Companies with old, large websites even scrape their own websites to find specific data they need to change, remove or analyze.

    Say what you will about scraping contact info in order to sell to those contacts, but it works. If it didn't, no one would scrape contact info. Look at it this way: why do telemarketing and spam still exist? Because a good percentage of people will buy from people selling via telemarketing and spam. People are suckers for a good "deal", then they cry when they get scammed. If people buy into scams from spammers, they will also buy legit products if "pitched" properly. But scraped contact info can be used in many ways, not just for sending emails. Salespeople can use scraped contact info when they are looking for more leads to target. Nothing wrong with it. And it beats searching for it manually. Scraping is more efficient if done properly, so it saves them time, even if it costs them money.

    Lots of uses for scraped data.
  • pectel
    In short, it is not legal if you sell the data, and it also depends on whom you are selling to.
    • Pierre Allain
      pectel, is it legal to scrape a website and use the data online?
  • pectel
    You can sell it on the black market, where the buyer does not care where the data comes from, but you would get very little for it.

    You cannot sell users' pictures, messages, emails or other sensitive information; you cannot even sell user logs.
    You cannot sell browser cookies or cache files.
  • Sunny P
    Programming is very logical. That's why it is very difficult.
