find ALL links with errors

by monere
4 replies
Hi everyone,

First of all, I apologize if I am posting this in the wrong sub-forum, but it made the most sense to me to post it here. The other option would have been the "programming" sub-forum, but there I probably would have received a technical answer that I wouldn't have known what to make of.

So anyway, I would like to ask if there is a fast and easy way to find all of the URLs on our company's website that return 404 and 504 errors.

I am in charge of my company's AdWords account, and for over a year I was able to get our ads approved by calling AdWords support and kindly asking them to approve the ads manually, since there is nothing wrong with the URLs those ads point to. As I said, this worked for more than a year, and the support staff would understand and approve the ads without any problem. Well, not anymore. They said we had been getting away with this trick for long enough and that they won't approve our ads anymore until we fix the issues we have with our site. The problem is, while I do check our Google Webmaster Tools (GWT) account every now and then and add the faulty URLs I find there to the robots.txt file, new buggy URLs seem to get discovered every day, endlessly. And if even one small URL returns a 404 or 504 error (according to AdWords support), their system will automatically disapprove our ads.

So, do you happen to know of a way/method/script/whatever (preferably free, obviously) that can quickly detect all existing broken links on our site, and alert us each time new ones are discovered so we can fix them? Even if there is no simple way to do all of this, something that solves the problem partially is welcome too.

Also, our website runs on the DotNetNuke platform (quite an old version too, 4.5 or 4.6 IIRC). I don't know if this info helps, but I thought I should mention it anyway. Our company sells industrial IT and, naturally, we have an ecommerce website.

If I left out any info that you feel might help you suggest a solution, just ask! Otherwise, I am looking forward to hearing your suggestions.

Thanks!
  • mirko76:
    To find the broken links, you could run a web scraper/crawler against your site regularly. I would use Scrapy because I know Python, but there are several alternatives.
    Keep in mind that the broken links are only the symptoms of an underlying problem. I do not know DotNetNuke, so I cannot help you on that front.
    Hope that helps
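    To make the crawler idea concrete: if installing Scrapy feels like overkill, a minimal link checker can be written with Python's standard library alone. This is only a sketch, not a production tool; `https://www.example.com` is a placeholder for your own site, and `max_pages` is an arbitrary safety cap.

```python
# Minimal same-domain crawler that reports links returning 404 or 504.
# Sketch only: replace "https://www.example.com" with your site's URL.
import urllib.request
import urllib.error
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs for every <a href> found in the HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def check_site(start_url, max_pages=200):
    """Breadth-first crawl of one domain; return {url: status} for failures."""
    domain = urlparse(start_url).netloc
    to_visit, seen, broken = [start_url], set(), {}
    while to_visit and len(seen) < max_pages:
        url = to_visit.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except urllib.error.HTTPError as e:
            if e.code in (404, 504):
                broken[url] = e.code
            continue
        except urllib.error.URLError:
            continue  # DNS/connection failures: skip, not a 404/504
        # Only keep crawling pages on the same domain
        for link in extract_links(html, url):
            if urlparse(link).netloc == domain and link not in seen:
                to_visit.append(link)
    return broken


if __name__ == "__main__":
    for url, status in check_site("https://www.example.com").items():
        print(status, url)
```

    You could run this from a scheduled task (cron or Windows Task Scheduler) and email yourself the output whenever it is non-empty, which would cover the "alert us when new broken links appear" part of the question.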
    • monere:
      I'll check that out. Thanks!

  • nmwf:
    Xenu's Link Sleuth is an offline option. It's for Windows and it's free: Find broken links on your site with Xenu's Link Sleuth (TM)
