Can I disallow https pages in robots.txt?

Hi,
I tried searching the web but could not find an answer.
My site opens over https:// for some reason I don't understand, which it should not.
Google has also indexed some author pages of the WordPress site with https:// URLs.

How do I block the https pages from being indexed?
Will the line
Disallow:https://
work in robots.txt?
  • Profile picture of the author sarged2
    Do you have a certificate? If so, why not simply use mod_rewrite to handle your https requests?
    • Profile picture of the author nancyfromafrica
      Thanks for the reply, sarged2. We have bought SSL services but are not using them on any pages yet; it is just a service site. We are not selling anything, so SSL pages are not needed.

      Should I redirect the https pages to http?
      • Profile picture of the author sarged2
        I see. You should use mod_rewrite then (assuming you run Apache) to redirect every HTTPS request to its HTTP equivalent.

        Just edit your .htaccess, something like this:
        Code:
        RewriteEngine On
        RewriteCond %{HTTPS} on
        RewriteRule ^(.*)$ http://yoursite.com/$1 [R=301,L]
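        On the original robots.txt question: a Disallow line takes a URL path, not a protocol, so Disallow:https:// will not match anything. Crawlers do fetch robots.txt separately for each protocol and host, though, so one alternative to the redirect is to serve a different file to HTTPS requests. A minimal sketch, assuming Apache with mod_rewrite and a hypothetical file named robots_ssl.txt in the document root:

        ```apache
        RewriteEngine On

        # Only for HTTPS requests: answer /robots.txt with robots_ssl.txt instead.
        # robots_ssl.txt is a hypothetical file name; it would contain:
        #   User-agent: *
        #   Disallow: /
        RewriteCond %{HTTPS} on
        RewriteRule ^robots\.txt$ robots_ssl.txt [L]
        ```

        This is an alternative to the blanket redirect, not something to combine with it, because the redirect would also catch the robots.txt request. It only stops crawling, so https URLs that are already indexed may take a while to drop out.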
  • Profile picture of the author sarged2
    Basically yes, each page should be reachable over only one protocol.
    In other words, every page should be either https or http, and if a request arrives over the wrong protocol, Apache should redirect it to the right one. If you don't set these rules, you leave an ambiguity, and it is then up to search engines and users which version gets used.
    For example, you could use https for pages that need to be secured, and http for common pages that do not need encryption, such as the homepage.
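    A rough sketch of that kind of split in .htaccess, assuming the secured pages all live under a hypothetical /checkout path (the path and the yoursite.com host are placeholders):

    ```apache
    RewriteEngine On

    # Plain-HTTP requests for the secured area go to HTTPS.
    RewriteCond %{HTTPS} off
    RewriteRule ^checkout(/.*)?$ https://yoursite.com/checkout$1 [R=301,L]

    # HTTPS requests for anything outside the secured area go back to HTTP.
    # A negated pattern captures nothing, so the path comes from REQUEST_URI.
    RewriteCond %{HTTPS} on
    RewriteRule !^checkout(/|$) http://yoursite.com%{REQUEST_URI} [R=301,L]
    ```

    One thing to watch: images, CSS and scripts used by the secured pages would also be bounced to http by the second rule, so they may need their own exception.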
  • Profile picture of the author nancyfromafrica
    Thank you, sarged2. I also read somewhere that I can do this by making changes in the httpd.conf file, but I am not sure which line to edit. Do you have any idea about it?
    Thanks for the .htaccess code.
  • Profile picture of the author jaisonjohn
    I think you might have moved all the files from the http folder into the https folder. If so, remove those files and folders from it, and at the same time place the code below in your .htaccess file:

    RewriteEngine On
    RewriteCond %{HTTPS} on
    RewriteRule ^(.*)$ http://yoursite.com/$1 [R=301,L]
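
    As for the httpd.conf route asked about above, the same redirect can sit in the HTTPS virtual host instead of .htaccess. A minimal sketch, assuming mod_ssl and mod_alias are loaded; the server name and certificate paths are placeholders, not values from this thread:

    ```apache
    # Hypothetical HTTPS virtual host; adjust ServerName and the certificate paths.
    <VirtualHost *:443>
        ServerName yoursite.com

        SSLEngine on
        SSLCertificateFile    /path/to/yoursite.crt
        SSLCertificateKeyFile /path/to/yoursite.key

        # mod_alias: send every HTTPS request to the same path over HTTP.
        Redirect permanent / http://yoursite.com/
    </VirtualHost>
    ```

    Changes to httpd.conf only take effect after Apache is reloaded or restarted, whereas .htaccess is read on each request.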