Is this even possible...?

by WillR
14 replies
Hi,

I have a competitor's website and want to be able to check all the pages they have on their domain. A lot of the pages are not indexed in the search engines (so a Google search is not possible), and the pages are also not linked from other pages or a sitemap.

Are there any sneaky/ninja ways to find the pages on a URL/domain?

I know it's a long shot but thought I'd ask.
  • Reply by KloudStrife
    Originally Posted by WillR View Post

    Hi,

    I have a competitor's website and want to be able to check all the pages they have on their domain. A lot of the pages are not indexed in the search engines (so a Google search is not possible), and the pages are also not linked from other pages or a sitemap.

    Are there any sneaky/ninja ways to find the pages on a URL/domain?

    I know it's a long shot but thought I'd ask.
    The problem could be within the website's code; maybe his SEO methods are off or different from normal websites. Sometimes Google won't pick up most pages when there's too much media content or Flash involved.
    • Reply by WillR
      Originally Posted by KloudStrife View Post

      The problem could be within the website's code; maybe his SEO methods are off or different from normal websites. Sometimes Google won't pick up most pages when there's too much media content or Flash involved.
      There's no problem. They have pages they don't want indexed. My question is how do I find those pages.
      • Reply by KloudStrife
        Originally Posted by WillR View Post

        There's no problem. They have pages they don't want indexed. My question is how do I find those pages.
        Hmmm, good question. Perhaps look through the actual website code in the browser for a direct link to the next page.
        • Reply by WillR
          Originally Posted by KloudStrife View Post

          Hmmm, good question. Perhaps look through the actual website code in the browser for a direct link to the next page.
          Nope, as I said, they are not linked anywhere.

          It's a long shot. I don't think it's even possible but just thought I would ask as there are some smart cookies out there.
        • Reply by joseph7384
          Originally Posted by WillR View Post

          There's no problem. They have pages they don't want indexed. My question is how do I find those pages.
          Originally Posted by KloudStrife View Post

          Hmmm, good question. Perhaps look through the actual website code in the browser for a direct link to the next page.

          That won't work, as I myself have off-blog pages! I build my own squeeze pages in an editor and FTP them to a blog's domain.

          I would say it's nearly impossible to know every page if they aren't indexed. The best thing I can tell you is to get on your competitor's lists (as many lists as this guy may have) and study the sales funnel to see what pages he sends you to.
  • Reply by beasty513
    Originally Posted by WillR View Post

    Hi,

    I have a competitor's website and want to be able to check all the pages they have on their domain. A lot of the pages are not indexed in the search engines (so a Google search is not possible), and the pages are also not linked from other pages or a sitemap.

    Are there any sneaky/ninja ways to find the pages on a URL/domain?

    I know it's a long shot but thought I'd ask.

    It could be that he put a robots.txt file on the site to make the search engines' spiders ignore it, so it doesn't get indexed.

    You can go to Ahrefs or Majestic SEO, enter the main domain, and see what backlinks he has been building.
    • Reply by WillR
      Originally Posted by beasty513 View Post

      It could be that he put a robots.txt file on the site to make the search engines' spiders ignore it, so it doesn't get indexed.

      You can go to Ahrefs or Majestic SEO, enter the main domain, and see what backlinks he has been building.
      The pages are not indexed on purpose. They are not linked or indexed anywhere. That's my point.
  • Reply by savidge4
    Will,

    the SNEAKY way? Open the robots.txt file and see what they are hiding! DUH

    Originally Posted by WillR View Post

    Hi,

    I have a competitor's website and want to be able to check all the pages they have on their domain. A lot of the pages are not indexed in the search engines (so a Google search is not possible), and the pages are also not linked from other pages or a sitemap.

    Are there any sneaky/ninja ways to find the pages on a URL/domain?

    I know it's a long shot but thought I'd ask.
    Signature
    Success is an ACT not an idea
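The robots.txt suggestion above is easy to script: the file lives at a fixed path (/robots.txt) and its Disallow lines are plain text. A minimal Python sketch — the sample robots.txt body and its paths are made up for illustration:

```python
from urllib.request import urlopen  # for fetching a live file, shown in the comment below

def disallowed_paths(robots_txt: str) -> list[str]:
    """Extract the path from every Disallow: line in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # an empty Disallow means "allow everything", so skip it
                paths.append(path)
    return paths

# Demo with a made-up robots.txt body; against a live site you would instead do:
#   body = urlopen("http://example.com/robots.txt").read().decode()
sample = """User-agent: *
Disallow: /private/   # hypothetical hidden section
Disallow: /tmp-launch.html
Disallow:
"""
print(disallowed_paths(sample))  # → ['/private/', '/tmp-launch.html']
```

As the next reply points out, though, this only works if the site owner actually listed the hidden pages there.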
    • Reply by Dennis Gaskill
      Originally Posted by savidge4 View Post

      Will,

      the SNEAKY way? Open the robots.txt file and see what they are hiding! DUH
      If they are trying to hide them, it's unlikely they'd put them in their robots file for anyone to see, but you never know, I guess. I never put anything I'm trying to hide in my robots file. That's like advertising it. Plus there's no need to if there are no links to them.
      Signature

      Just when you think you've got it all figured out, someone changes the rules.

      • Reply by savidge4
        Ah, but the OP is asking to find pages that are not in the SERPs and don't have page links. It is VERY possible the site he basically wants to hack is USING the pages but simply has them marked nofollow. We all know that doing so on a page level is for the most part fruitless, so in most cases that would be done either in the robots.txt file, OR there stands the possibility of some server-side blocking in the .htaccess file.

        Originally Posted by Dennis Gaskill View Post

        If they are trying to hide them it's unlikely they'd put them in their robots file for anyone to see, but you never know, I guess. I never put anything I'm trying to hide in my robots file. That's like advertising them. Plus there's no need to if there are no links to them.
        Signature
        Success is an ACT not an idea
  • Reply by yukon
    Banned
    I know you already said the pages aren't linked to each other but I would still run Screaming Frog to see what pops up. SF will show you 500 pages/URLs for free on the trial version.
  • Reply by tpw
    You can get a PHP spider tool and simply tell your spider to ignore robots.txt.

    The PHP spider will crawl the site the same way Google does, by looking at the directory tree.
    Signature
    Bill Platt, Oklahoma USA, PlattPublishing.com
    Publish Coloring Books for Profit (WSOTD 7-30-2015)
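A spider (whether written in PHP or anything else) can only discover URLs that appear in pages it has already fetched; ignoring robots.txt just means it won't skip disallowed ones. A rough Python sketch of the link-following part — the start URL in real use would be the competitor's domain, and this is an illustration, not a production crawler:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs for every link found in the HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]

def crawl(start_url: str, limit: int = 100) -> set[str]:
    """Breadth-first crawl that stays on the start URL's domain."""
    domain = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip pages that fail to load
        for link in extract_links(html, url):
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

Note that, as the reply below argues, this still only finds linked pages: a page with no inbound links anywhere stays invisible to any crawler, robots.txt or not.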
    • Reply by Doug9
      Originally Posted by tpw View Post

      The PHP spider will crawl the site the same way Google does, by looking at the directory tree.
      I don't think Google or anything else crawls sites by looking at the directory tree.
      The only way to get to a page on a web server is to supply an exact URL. This is usually provided by a link on the page that a user or Google can see.

      There is no way to see what directories or files are on the server unless this is purposely done by the owner.

      Web servers are specifically built so pages are not knowable unless the owner wants them to be known.
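That said, "supply an exact URL" cuts both ways: people sometimes guess common paths (thank-you pages, download pages, and so on) and check which requests come back with HTTP 200 rather than 404. A small Python sketch of that idea — the site, the candidate paths, and the fake status checker below are all hypothetical, and the status function is injected so the logic runs without a network:

```python
from urllib.parse import urljoin

def probe(base_url, candidate_paths, fetch_status):
    """Return the candidate URLs whose HTTP status indicates a real page.

    fetch_status is a callable taking a URL and returning an int status
    code; in real use it would wrap urllib.request.urlopen or similar.
    """
    found = []
    for path in candidate_paths:
        url = urljoin(base_url, path)
        if fetch_status(url) == 200:
            found.append(url)
    return found

# Demo with a fake status checker standing in for real HTTP requests.
fake_site = {"http://example.com/thankyou.html": 200}
hits = probe("http://example.com/",
             ["thankyou.html", "oto.html"],  # hypothetical guesses
             lambda url: fake_site.get(url, 404))
print(hits)  # → ['http://example.com/thankyou.html']
```

This only ever finds pages whose names you can guess, which is consistent with the point above: the server never volunteers its file list.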
      • Reply by yukon
        Banned
        Originally Posted by Doug9 View Post

        I don't think Google or anything else crawls sites by looking at the directory tree.
        The only way to get to a page on a web server is to supply an exact URL. This is usually provided by a link on the page that a user or Google can see.

        There is no way to see what directories or files are on the server unless this is purposely done by the owner.

        Web servers are specifically built so pages are not knowable unless the owner wants them to be known.
        Especially considering a large percentage of the web consists of dynamic CMS pages built on the fly.