I need some programming advice

4 replies
What type of script, preferably PHP, would you recommend for scraping data from one website, parsing the information, and publishing it to your own website? Any ideas?
#advice #programming
  • Profile picture of the author ezkl
    You can scrape, parse, and repurpose data with any scripting language that can make an HTTP request and parse the DOM (read: all of them). What platform are you looking to publish the data on? That is, are you updating a database, generating a WordPress post, uploading an HTML file that you'll expose directly, etc.?
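    Since the question leans toward PHP, here is a minimal sketch of the "parse the DOM" half using PHP's built-in DOMDocument and DOMXPath classes. The helper name `extract_items`, the sample markup, and the XPath queries are all made up for illustration; a real target page will need its own selectors, and the HTML would come from an HTTP request rather than a hard-coded string.

```php
<?php
// Hypothetical helper: pull a title and an image URL out of each "item" block.
// The XPath queries are assumptions about the target page's markup.
function extract_items(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $items = [];
    foreach ($xpath->query('//div[@class="item"]') as $node) {
        // Query relative to the current item node.
        $titleNode = $xpath->query('.//h2', $node)->item(0);
        $imgNode   = $xpath->query('.//img', $node)->item(0);
        $items[] = [
            'title' => $titleNode ? trim($titleNode->textContent) : '',
            'image' => $imgNode ? $imgNode->getAttribute('src') : '',
        ];
    }
    return $items;
}

// Stand-in for a fetched page; in practice this string would come from
// something like file_get_contents($url) or cURL.
$html = '<html><body>'
      . '<div class="item"><h2> First post </h2><img src="/a.png"></div>'
      . '<div class="item"><h2>Second post</h2></div>'
      . '</body></html>';

$items = extract_items($html);
print_r($items);
```

    The parsing is kept separate from the fetching so the same function can be fed saved HTML while you work out the selectors.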
  • Profile picture of the author KingRoyal
    Well, we would be taking the information scraped from a website (images and text) and putting it into a custom post type on WordPress. We already have this, but the way it is set up is a complete disaster :(
    • Profile picture of the author KirkMcD
      Originally posted by KingMighty:

      Well we would be taking the information scraped, images
      IMAGES!!!
      Besides the fact that you are just copying copyrighted stuff without permission, image owners tend to send large bills when you use their stuff without permission.

      Anyway, if you already have something that does most of what you want, just fix it.
      Why do you want to start from scratch?
  • Profile picture of the author David V
    I was just reading this article by Zane Matthew last week about web scraping in WP, which led me to this PHP HTML DOM parser script on SourceForge. Worth browsing.
    I would scour the code of the many scraper scripts and plugins that are out there for some ideas also.
    There are many on WordPress.org, Codecanyon.net, and SourceForge.net.
    The scraping is likely the easiest part, since there are so many scraper scripts and plugins available already.
    The creative part will be choosing the best way to store the data and then translate it into your CPT.
    It shouldn't be too bad if you tear apart the many CPT creation plugins at WordPress and see how they create them.
    So it would be a matter of automating it and grabbing the data from the scraper.
    Sounds like you'll be busy.
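    To make the "translate it into your CPT" step a bit more concrete, here is a rough sketch of mapping one scraped item onto the arguments for WordPress's `wp_insert_post()`. Everything specific here is an assumption: the helper `build_post_args`, the post type slug `scraped_item`, the meta key `source_url`, and the shape of the scraped array are placeholders, not anything from an existing plugin.

```php
<?php
// Hypothetical helper: turn one scraped item into arguments for wp_insert_post().
// 'scraped_item' is a made-up custom post type slug; 'source_url' is a made-up meta key.
function build_post_args(array $item, string $postType = 'scraped_item'): array
{
    return [
        'post_type'    => $postType,
        'post_title'   => trim($item['title'] ?? ''),
        'post_content' => $item['body'] ?? '',
        'post_status'  => 'draft', // keep as draft until someone reviews it
        'meta_input'   => ['source_url' => $item['url'] ?? ''],
    ];
}

$args = build_post_args([
    'title' => ' Example title ',
    'body'  => 'Example body text.',
    'url'   => 'https://example.com/page',
]);

// Only call into WordPress when this runs inside WordPress (plugin, WP-CLI, etc.).
if (function_exists('wp_insert_post')) {
    $postId = wp_insert_post($args, true); // true: return a WP_Error on failure
    if (is_wp_error($postId)) {
        error_log($postId->get_error_message());
    }
} else {
    // Outside WordPress, just show what would be inserted.
    print_r($args);
}
```

    Building the argument array in its own function means the mapping can be checked without a WordPress install, and the custom post type itself still has to be registered separately.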
