Hello,

I have tools that automatically bookmark the URLs I feed them. However, as my syndication network grows, gathering these URLs manually is taking more time than it is worth.

Can anyone recommend a tool that will harvest URLs for me? Ideally, I would point it at a site and it would output a list of URLs it has not seen before.

Thank you in advance.
#harvesting #tool #url
  • Profile picture of the author janeiro82
    I'm not quite sure what you are talking about... maybe "Scrapebox" could help?
    • Profile picture of the author dracoboar
      Originally Posted by janeiro82

      I'm not quite sure what you are talking about... maybe "Scrapebox" could help?

      Basically, I am looking for a program that will crawl a website and output (to an Excel file, for example) the URLs that it has not seen before.

      For instance, a few of my articles auto-syndicate to a website. I then run the URL harvester, and it creates an Excel file with all the new URLs on that website, which I can then feed to my bookmarking tool.
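
For anyone who would rather script this step than buy a tool, the "only URLs it has not seen before" part can be sketched in a few lines of Python. This is a rough illustration, not a finished harvester: the function names and the `seen.txt` / `new_urls.csv` file names are made up for the example, and it assumes you already have a list of URLs found on the site from some other step. It writes a CSV file, which Excel opens directly.

```python
import csv
from pathlib import Path


def load_seen(seen_path):
    """Return the URLs recorded on earlier runs (empty set on the first run)."""
    p = Path(seen_path)
    if not p.exists():
        return set()
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def filter_new(found, seen):
    """Keep only URLs not in `seen`, preserving order and dropping repeats."""
    known = set(seen)
    fresh = []
    for url in found:
        if url not in known:
            known.add(url)
            fresh.append(url)
    return fresh


def record_run(found, seen_path, csv_path):
    """Write this run's new URLs to a CSV file and remember them for next time."""
    fresh = filter_new(found, load_seen(seen_path))
    # Overwrite the CSV so it only ever holds the latest batch of new URLs.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])
        writer.writerows([u] for u in fresh)
    # Append to the history file so these URLs are skipped on later runs.
    with open(seen_path, "a", encoding="utf-8") as f:
        for u in fresh:
            f.write(u + "\n")
    return fresh
```

Calling `record_run(found_urls, "seen.txt", "new_urls.csv")` after each harvest would leave only the newly discovered URLs in the CSV, ready to feed to a bookmarking tool.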


      Thanks again
  • Profile picture of the author kingtana1
    Scrapebox can harvest all pages of a website. Enter the domain name in the keywords field like this: site:domain.com

    You would be harvesting the URLs from a search engine, so set the max connections for that engine to 5, set the timeout to 30 seconds, tick the box to use proxies, set results to 1000, and click Harvest.
    • Profile picture of the author yukon
      Banned
      Originally Posted by kingtana1

      Scrapebox can harvest all pages of a website. Enter the domain name in the keywords field like this: site:domain.com

      You would be harvesting the URLs from a search engine, so set the max connections for that engine to 5, set the timeout to 30 seconds, tick the box to use proxies, set results to 1000, and click Harvest.

      The downside of that (site:domain) is that if Google doesn't know about the pages, neither will Scrapebox.
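
One way around the search-index limitation is to read links straight from the site's own pages instead of from search results. The snippet below is only a sketch of that idea using Python's standard library; `LinkCollector` and `same_site_links` are names invented for this example, it handles a single page of HTML you have already downloaded, and it is not how Scrapebox works internally.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def same_site_links(html, base_url):
    """Return unique links on the page that point to the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    unique = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in unique:
            unique.append(link)
    return unique
```

Running this over each page you fetch, and feeding the results back in as pages to fetch next, is the basic shape of a crawler. A real one would also need politeness delays and a limit on how many pages it visits.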
      • Profile picture of the author kingtana1
        Originally Posted by yukon

        The downside of that (site:domain) is that if Google doesn't know about the pages, neither will Scrapebox.

        Thank you for clarifying, this is true. I'm sure you would agree that Scrapebox is a good tool.

        You can use multiple search engines to harvest with it. To get the most out of it, it helps to learn the search operators for each engine.
  • Profile picture of the author JSProjects
    If the site has an XML sitemap, Scrapebox can grab all of the URLs from it.
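
If you want to read a sitemap without Scrapebox, the same thing can be approximated with Python's standard library. This is a minimal sketch under a few assumptions: `urls_from_sitemap` is a made-up helper name, the sitemap text has already been downloaded, and the file follows the usual sitemap layout where each page address sits in a `<loc>` element.

```python
import xml.etree.ElementTree as ET


def urls_from_sitemap(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Tags arrive as "{namespace}loc"; compare only the local name so the
        # function works whether or not the file declares the sitemap namespace.
        # Note: this also picks up <loc> entries from extensions such as image sitemaps.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls
```

For a sitemap index file, the `<loc>` values returned are the addresses of further sitemaps, so each of those would need to be downloaded and passed through the same function.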