Scraping websites - use PHP and Regexp or something else?
I am currently learning PHP so I can build tools and other cool stuff for my websites. The first project I want to take on is a script that scrapes vendor websites to populate an affiliate site (all with permission from the vendors, of course).
Getting the HTML file seems pretty straightforward, but next I would want to extract the data. The standard solution seems to be regular expressions, but I have read some people suggesting not using PHP for this at all and using a Python library instead.
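For anyone reading along, the extraction step does not have to mean regular expressions or a switch to Python: PHP ships with a DOM parser. Below is a minimal sketch, assuming a made-up page layout in which each product sits in a `div` with class `product`. The markup, the class names, and the `extractProducts` helper are all hypothetical and would need to be adapted to the real vendor pages.

```php
<?php
// Hypothetical markup standing in for a page that has already been fetched.
$html = <<<'HTML'
<html><body>
  <div class="product"><h2 class="name">Blue Widget</h2><span class="price">$19.99</span></div>
  <div class="product"><h2 class="name">Red Widget</h2><span class="price">$24.50</span></div>
</body></html>
HTML;

// Hypothetical helper: returns a list of ['name' => ..., 'price' => ...] arrays.
function extractProducts(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $products = [];

    // The XPath expressions match the assumed layout above.
    foreach ($xpath->query('//div[@class="product"]') as $node) {
        $name  = $xpath->query('.//h2[@class="name"]', $node)->item(0);
        $price = $xpath->query('.//span[@class="price"]', $node)->item(0);

        // Skip entries that are missing either field.
        if ($name === null || $price === null) {
            continue;
        }

        $products[] = [
            'name'  => trim($name->textContent),
            'price' => trim($price->textContent),
        ];
    }

    return $products;
}

$products = extractProducts($html);
print_r($products);
```

The point of going through a parser is that the lookups follow the document structure, so small changes in whitespace or attribute order on the vendor's side are less likely to break them than a hand-written pattern would be.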
After that I would want to get the data onto my website. Do I need to store it in a MySQL database, or could I go straight from an array to the website?
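To make the "straight from an array" option concrete, here is a minimal sketch that turns an array of scraped items directly into an HTML list, with no database involved. The `renderProducts` helper and the sample data are hypothetical; the one part that matters in practice is escaping the scraped text with `htmlspecialchars` before printing it.

```php
<?php
// Hypothetical helper: renders scraped items as an HTML list.
function renderProducts(array $products): string
{
    $rows = '';
    foreach ($products as $product) {
        // Scraped text is untrusted input, so escape it before output.
        $name  = htmlspecialchars($product['name'], ENT_QUOTES, 'UTF-8');
        $price = htmlspecialchars($product['price'], ENT_QUOTES, 'UTF-8');
        $rows .= "<li>{$name}: {$price}</li>\n";
    }

    return "<ul>\n{$rows}</ul>\n";
}

// Sample data standing in for the result of the extraction step.
$products = [
    ['name' => 'Blue Widget', 'price' => '$19.99'],
    ['name' => 'Red Widget',  'price' => '$24.50'],
];

echo renderProducts($products);
```

This works without any storage, but it implies fetching and parsing the vendor page on every page view. Saving the array somewhere (MySQL or otherwise) is mainly a way to avoid that repeated work, so whether it is needed depends on how often the data changes and how much traffic the site gets.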
I'm a newbie with PHP, though I do know programming basics. Is the process I outlined above the right way to do it? I don't want to head down the wrong path!