What does Disallow: /http/ or /https/ mean?

3 replies
Hi Guys,

I checked some of my competitors' robots.txt files and found that one well-known site's robots.txt says:

User-agent: *
Disallow: /http/
Disallow: /cgi-bin/
Disallow: /https/
Disallow: /admin/

What does this mean?
Does it stop search engines from crawling the http or https versions of the site, meaning all of its pages?

P.S. That website has many pages ranked on Google's top pages.
  • Profile picture of the author yukon
    Banned
    Originally posted by rohanjha:

    Hi guys,

    I checked some of my competitors' robots.txt files and found that one well-known site's robots.txt says:

    User-agent: *
    Disallow: /http/
    Disallow: /cgi-bin/
    Disallow: /https/
    Disallow: /admin/

    What does this mean?
    Does it stop search engines from crawling the http or https versions of the site, meaning all of its pages?

    P.S. That website has many pages ranked on Google's top pages.


    Those rules aren't blocking the http or https protocols. They block sub-folders/directories that happen to be named http and https. You can tell by the forward slash to the left of the name.

    • User-agent: *
    • Disallow: /http/
    • Disallow: /cgi-bin/
    • Disallow: /https/
    • Disallow: /admin/

    That leading forward slash means the folder/directory is nested one level below the domain name.
    • domain.com/http/
    • domain.com/cgi-bin/
    • domain.com/https/
    • domain.com/admin/
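
To check this yourself, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules are the ones quoted above; `domain.com`, the page names, the `Googlebot` user agent string, and the `is_allowed` helper are placeholders chosen for illustration, not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /http/
Disallow: /cgi-bin/
Disallow: /https/
Disallow: /admin/
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if the rules above let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: only the path part is compared with the rules.
    for url in (
        "https://domain.com/",
        "http://domain.com/about.html",
        "https://domain.com/http/page.html",
        "https://domain.com/https/page.html",
        "https://domain.com/admin/login",
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

Only the URLs whose path starts with one of the listed folders should come back as blocked. The `http://` or `https://` scheme at the front of the URL plays no part in the match.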
    • Profile picture of the author rohanjha
      Thanks man,

      But why do webmasters use these /http/ or /https/ entries in robots.txt?
  • Profile picture of the author Richard Whyte
    Hi

    Often when you are updating pages or doing a redesign, you need a workspace. You can place those pages in a sub-folder while you work on them, before you move them to the live website.

    While you are working on them, you do NOT want the search engines to find them, because they can cause duplicate content issues. So you ASK the search engines NOT to visit those folders on your web server.
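
As a rough sketch of that idea, the snippet below builds the Disallow lines for a couple of workspace folders and confirms that a crawler honoring robots.txt would skip them. The folder names `redesign` and `drafts`, the `build_robots_txt` helper, and the `domain.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(work_folders):
    """Build robots.txt text asking all crawlers to skip the given folders."""
    lines = ["User-agent: *"]
    for folder in work_folders:
        # Normalize to the "/name/" form used in the thread.
        lines.append(f"Disallow: /{folder.strip('/')}/")
    return "\n".join(lines) + "\n"


# Hypothetical workspace folders used during a redesign.
robots_txt = build_robots_txt(["redesign", "drafts"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

if __name__ == "__main__":
    print(robots_txt)
    for url in (
        "https://domain.com/redesign/home.html",
        "https://domain.com/home.html",
    ):
        verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
        print(url, verdict)
```

Keep in mind that this is only a request: well-behaved crawlers follow it, but robots.txt does not password-protect or hide the folders.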