Site crawler tool to define internal linking structure?

by 8 comments
OK, either I'm having a mental block remembering a tool like this, or this is the first time I've ever needed this tool.

I'm inheriting support/development responsibilities for a very large, very poorly designed, 5+ year old, PHP website, with a ton of broken links (dumb move, I know!).

Is there a tool out there that crawls a site and maps all of its internal page links, i.e. which pages link to which pages? This site is a mess, and I can't see manually researching how to navigate to hundreds of pages.

Something that creates a graphical site structure would be awesome, but a searchable text-based report would work too.

Mark
#web design #crawler #define #internal #linking #site #structure #tool

  • MarkR
    So, did I stump Warriors? Is there nothing like this?

    Xenu comes kind of close, but doesn't show all links off of a page.

    I don't think I'm the only person that ever needed this functionality, am I?
  • davidmerrick
    I need something like this too bro. You could try the one I'm using called OutWit Hub. It's a Firefox Plugin that crawls sites and exports to .csv. You can find it here and it's free: Harvest The Web - OutWit Technologies.

    What I'm looking for is a crawler that will follow links and pull product titles. Do you know of anything like that?
  • signupmakemoney
    Google Webmaster Tools will tell you all the broken links there are in the site. You just have to verify that you own the site first.
  • MarkR
    David,

    Thanks. OutWit seems to spider only the links off the home page, one link deep, unless I'm using it wrong. I want to spider the home page, then the pages the home page links to, then those pages' links, and so on.

    signupmakemoney,

    I'm not looking for just broken links (Xenu does that well). I'm looking for a tool that 1) spiders the whole site, 2) tells me which pages link to which pages, and 3) puts it in a graphical or text-based report that I can interrogate. So, when I need to find out which pages link to pagename123.html, I can find that information.

    This seems like a logical request, but nothing turns up in searches.
  • DAN11
    I need this as well...

    Have you tried downloading the whole site into Dreamweaver or FrontPage and looking at the links from within that software?
  • mojojuju
    Here's how you can do it with Xenu and Graphviz: How to Visualize Link Flow Within a Website | WordStream
  • MarkR
    Thanks for that. It seems to be what I need. I wonder how it looks and how long it takes with a site of 4k-6k pages? LOL
  • DAN11
    Hi,
    Thanks for that, but I don't see where in Xenu I can export to the required file format?
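
For anyone who would rather script the three requirements from the thread (spider the whole site, record which pages link to which, make the result searchable), here is a minimal sketch using only the Python standard library. It is not how Xenu or OutWit work internally. The names `LinkParser`, `fetch_html`, `crawl`, `invert`, and the `max_pages` cap are made up for illustration, and the sketch assumes plain `<a href>` links on a single host, with no robots.txt handling, rate limiting, or JavaScript-generated links.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page; return '' for non-HTML content or common network errors."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(start_url, fetch=fetch_html, max_pages=10000):
    """Breadth-first crawl. Returns {page: set of same-host pages it links to}."""
    start_url = urldefrag(start_url).url
    host = urlparse(start_url).netloc
    links_out = {}
    queue = deque([start_url])
    while queue and len(links_out) < max_pages:
        page = queue.popleft()
        if page in links_out:  # already visited
            continue
        parser = LinkParser()
        parser.feed(fetch(page))
        targets = set()
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments so each page has one key.
            target = urldefrag(urljoin(page, href)).url
            if urlparse(target).netloc == host:  # keep internal links only
                targets.add(target)
        links_out[page] = targets
        queue.extend(t for t in targets if t not in links_out)
    return links_out


def invert(links_out):
    """Turn 'page -> pages it links to' into 'page -> pages that link to it'."""
    links_in = defaultdict(set)
    for page, targets in links_out.items():
        for target in targets:
            links_in[target].add(page)
    return links_in
```

With this in place, `invert(crawl("http://www.example.com/"))` (a placeholder URL) gives a dictionary you can query for any page, such as pagename123.html, to list the pages that link to it. The `fetch` parameter exists so the crawl logic can be tried against canned HTML before pointing it at a live site.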

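On the Graphviz side, the input is just a text file in the DOT language, so any "page -> linked pages" mapping can be turned into one without going through a specific export format. A small sketch follows, assuming the link data is already in a Python dictionary. The function name `to_dot` and the graph name `site` are illustrative and not part of Xenu or Graphviz.

```python
def to_dot(links_out, name="site"):
    """Render {page: iterable of linked pages} as Graphviz DOT text."""

    def quote(text):
        # DOT string literals are double-quoted; escape backslashes and quotes.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    for page in sorted(links_out):
        lines.append(f"  {quote(page)};")  # declare the node even if it has no links
        for target in sorted(links_out[page]):
            lines.append(f"  {quote(page)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)
```

Saving the returned text as site.dot and running `dot -Tsvg site.dot -o site.svg` should produce a picture. For a site with thousands of pages the full graph will probably be too dense to read, so filtering the dictionary down to one section of the site first is likely to be more useful.
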