How much for a project like this....

10 replies
I need software that will crawl a list of 200+ websites.

Find specific listings that contain keywords "equipment," "supplies" and other variations.
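To make the keyword step concrete, here is a minimal sketch of how listing text could be matched against "equipment", "supplies" and a few variations. The exact variations are an assumption for illustration; the real list would come from whoever owns the project.

```python
import re

# Assumed keyword variations; extend this pattern with the real list.
KEYWORD_PATTERN = re.compile(
    r"\b(?:equipment|equip|suppl(?:y|ies))\b",
    re.IGNORECASE,
)


def matches_keywords(text: str) -> bool:
    """Return True if the listing text contains any of the keyword variations."""
    return bool(KEYWORD_PATTERN.search(text))
```

A crawler would call `matches_keywords` on each listing's title or description and keep only the listings for which it returns True.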

After finding these listings, grab data about each one, including: 1) Date, 2) Images, 3) Description, 4) Physical Address.

Then, relist this information on another website.

The site that will re-list the data is a listing website with these existing data fields.

How much should I expect to pay for something like this to be built and what skillsets would someone capable of this need to have?

Thanks in advance.
#project
  • TheCrazyCoder
    Do you expect the crawler to run on a server or on your desktop?
    Do you plan to distribute the crawler to your customers, or will it be for your own use only?
  • TheCrazyCoder
    Do you already have the list of 200+ sites, or should the crawler find them?
    • ekimura
      I have a list of the websites. The crawler will run on a server, and it is being developed for one website only.
      • jimjones
        Hi,

        I have worked a lot with crawlers and web data extraction. The task is conceptually simple, but it takes a lot of work to reach a fully automated solution.

        Every required field (Date, Images, Description, Physical Address) has to be annotated for each of these 200 websites; otherwise the crawler doesn't "know" what to extract.
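As a rough illustration of what "annotating" a site could look like, the sketch below maps each field to the CSS class that holds it on one site and pulls the text out with Python's standard `html.parser`. The site name, class names, and the `extract_fields` helper are all hypothetical; a real project would likely use a dedicated scraping library, and images would need attribute capture (e.g. `src`) rather than text.

```python
from html.parser import HTMLParser

# Hypothetical per-site annotations: field name -> CSS class that holds it.
SITE_ANNOTATIONS = {
    "example-auctions.test": {
        "date": "listing-date",
        "description": "listing-body",
        "address": "listing-location",
    },
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class FieldExtractor(HTMLParser):
    """Collects the text inside elements whose class matches an annotation."""

    def __init__(self, annotations):
        super().__init__()
        self.class_to_field = {cls: field for field, cls in annotations.items()}
        self.fields = {}
        self._current = None  # field currently being captured
        self._depth = 0       # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.class_to_field:
                self._current = self.class_to_field[cls]
                self._depth = 1
                self.fields.setdefault(self._current, "")
                break

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_fields(html, annotations):
    """Return {field: text} for one page, with whitespace normalized."""
    parser = FieldExtractor(annotations)
    parser.feed(html)
    return {field: " ".join(text.split()) for field, text in parser.fields.items()}
```

Sites that share a platform or theme could share one annotation entry, which is where most of the savings would come from.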

        Edit: To answer your initial question, I would say you have to pay at least 1000 USD for a fully automated solution. But keep in mind that you have to redo the annotation for a website whenever it changes its layout (e.g. the Description field moves).
        • ekimura
          That is about what I expected it to cost....

          I'm assuming the area that requires lots of work is the relisting part.

          Most of the websites I'm looking to collect data from are using the same platform/format/theme.

          Do you know how much this will affect the speed of the website that is compiling the data? Also, would this pose any danger to the websites I'm scraping? Maybe I'm wrong to assume it will create any drag.
          • jimjones
            Relisting is the easy part: just insert or update the crawled data in your database. You do need to crawl a unique article number as well, so each product can be identified in your database and either inserted or updated. That shouldn't have much of an impact on your website's performance.
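The insert-or-update step could look roughly like the sketch below, which uses SQLite only so the example is self-contained. The table and column names are made up for illustration, and `source_id` stands in for the unique article number; the real relisting site will have its own schema. The `ON CONFLICT ... DO UPDATE` clause assumes SQLite 3.24 or newer.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) listings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS listings (
            source_id   TEXT PRIMARY KEY,  -- unique article number from the source site
            listed_on   TEXT,
            description TEXT,
            address     TEXT
        )
        """
    )
    return conn


def upsert_listing(conn, listing):
    """Insert a new listing, or update the existing row with the same source_id."""
    conn.execute(
        """
        INSERT INTO listings (source_id, listed_on, description, address)
        VALUES (:source_id, :listed_on, :description, :address)
        ON CONFLICT(source_id) DO UPDATE SET
            listed_on   = excluded.listed_on,
            description = excluded.description,
            address     = excluded.address
        """,
        listing,
    )
    conn.commit()
```

Because each crawl either adds one row or rewrites one existing row, re-crawling the same listing does not create duplicates.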

            If you do the crawling wrong, the websites will ban your IP address. You have to be polite when crawling: obey robots.txt, pause between requests, etc.
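A minimal sketch of those two politeness rules, using Python's standard `urllib.robotparser`. The `PoliteGate` class, the user-agent string, and the 2-second default delay are assumptions for illustration; the robots.txt text is passed in already downloaded so the example needs no network access.

```python
import time
from urllib import robotparser
from urllib.parse import urlparse


class PoliteGate:
    """Checks robots.txt rules and enforces a pause between hits to one host."""

    def __init__(self, robots_txt_by_host, user_agent="listing-crawler", delay_seconds=2.0):
        self.user_agent = user_agent
        self.delay_seconds = delay_seconds
        self._rules = {}
        self._last_hit = {}
        for host, text in robots_txt_by_host.items():
            parser = robotparser.RobotFileParser()
            parser.parse(text.splitlines())
            self._rules[host] = parser

    def may_fetch(self, url):
        """True only if robots.txt for the URL's host was loaded and allows it."""
        rules = self._rules.get(urlparse(url).netloc)
        return rules is not None and rules.can_fetch(self.user_agent, url)

    def wait_turn(self, url, now=time.monotonic, sleep=time.sleep):
        """Sleep until delay_seconds have passed since the last hit to this host."""
        host = urlparse(url).netloc
        last = self._last_hit.get(host)
        if last is not None:
            remaining = self.delay_seconds - (now() - last)
            if remaining > 0:
                sleep(remaining)
        self._last_hit[host] = now()
```

The crawler would call `may_fetch(url)` before each request and `wait_turn(url)` right before sending it. Treating hosts with no loaded robots.txt as off-limits is a deliberately cautious choice in this sketch.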

            Keep in mind that a lot of websites use JavaScript/AJAX to present their data. Your crawler must support this.

            The part that requires a lot of work is the data annotation: showing the crawler where the images are, where the article number and description are, etc., on every website.

            Hope that helps.

            Originally Posted by kpmedia

            You'll never find something like this for that cheap. I bet something that complex is 10x that price.
            That's true if you have to start from scratch.
            • pphillips001
              I am puzzled why this has been posted here.

              Isn't there a section on this site for tendering this sort of work?

              Or are you just after a general approximation?
        • kpmedia
          Originally Posted by jimjones View Post

          Edit: To answer your initial question. I would say you have to pay at least 1000 USD .
          You'll never find something like this for that cheap. I bet something that complex is 10x that price.

          Good luck to you.

          Some devs won't get near this either, FYI, because it sounds like a spam tool. And most devs want nothing to do with that -- it will tarnish their rep, and is high risk.
          • onsmith
            Do you already have your "relisting" website set up? Or would you need that created as well?
            • ekimura
              "Relisting" website is already set up. Some are already using the website to create listings for upcoming events.

              I'm not trying to create a spambot - just trying to automate a process.

              Thanks for the input.