25 replies
What is robots.txt, and how is it used?
#robotstxt
  • Profile picture of the author Kris79
    robots.txt is a file placed in the root folder of your domain.
    It contains instructions that tell search engine bots which content on your website they should crawl and index.

    It is very helpful when you have a more complex site structure and don't want search engines to crawl all of your content.

    Here is more about it:
    The Web Robots Pages
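
    Since the file has to sit in the root folder, its address can be worked out from any page on the same host. A minimal sketch in Python, assuming only the standard library; `robots_url` and the example.com address are made-up names for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/blog/post.html?page=2"))
```

    Each host name (including each subdomain) gets its own robots.txt under this scheme.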
  • Profile picture of the author iamnotfrankkern
    You can get a detailed explanation of robots.txt at The Web Robots Pages.
  • Profile picture of the author richardfranklin
    You can go to robotstxt.org or check Wikipedia for the exact definition.
  • Profile picture of the author stevejhon
    Basically, robots.txt is a file located on your web server. Its main purpose is to let bots/crawlers know which of your web pages they may crawl and index. You can also use robots.txt to restrict pages from being indexed!
  • Profile picture of the author ashleysmith12
    Hi,
    Robots.txt is a text file that tells search engines what to crawl and what not to crawl.
  • Profile picture of the author webgnomes
    You can also find a very similar robots.txt discussion in this thread: warriorforum.com/adsense-ppc-seo-discussion-forum/534078-robots-txt.html
  • Profile picture of the author hilarious89
    If I block a page from crawling, Google won't crawl it. So can I use black hat techniques on that page?
  • Profile picture of the author John Conner
    Robots.txt is a .txt file that tells search engines what to index, and what not to index, on a particular website.

    Here is one example:

    User-agent: *
    Disallow: /
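
    Note that this particular example blocks every crawler from the whole site. A quick way to check that, sketched with Python's standard `urllib.robotparser`; the example.com URLs and the "Googlebot" agent string are just placeholders:

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)  # parse in-memory lines instead of fetching a URL

# Every path starts with "/", so both checks should come back blocked.
home_allowed = parser.can_fetch("Googlebot", "https://www.example.com/")
page_allowed = parser.can_fetch("Googlebot", "https://www.example.com/blog/post.html")
print(home_allowed, page_allowed)
```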
  • Profile picture of the author hilarious89
    So how can I create this robots.txt? Should I open a new text document and just rename it to robots.txt?
  • Profile picture of the author ktonline
    Open up Notepad, save the file as 'robots.txt', and upload it to the root folder of your site.
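
    If you would rather generate the file from a script than type it by hand, something like this should do it. It is only a sketch: `write_robots_txt` is a made-up helper, and a temporary folder stands in for your site's real root folder.

```python
import tempfile
from pathlib import Path


def write_robots_txt(site_root: Path, rules: list[str]) -> Path:
    """Write the rules to site_root/robots.txt, one rule per line."""
    target = site_root / "robots.txt"
    target.write_text("\n".join(rules) + "\n", encoding="utf-8")
    return target


# A temporary folder stands in for the site's root folder here.
with tempfile.TemporaryDirectory() as folder:
    # An empty Disallow value means nothing is blocked.
    path = write_robots_txt(Path(folder), ["User-agent: *", "Disallow:"])
    written_name = path.name
    written_text = path.read_text(encoding="utf-8")

print(written_name)
```

    The file name has to be exactly robots.txt (plural, lowercase) for crawlers to find it.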
  • Profile picture of the author jsmith2482
    How effective or helpful is robots.txt for SEO? There are a few Warriors selling it in their SEO packages, saying it will improve rankings.
    • Profile picture of the author hilarious89
      We can create it all by ourselves, so why buy it?
  • Profile picture of the author Rob Ainge
    Go to Google's Webmaster Tools site (do a Google search) and add your site.

    Then click on "site configuration" on the left-hand side, then "crawler access", and it will allow you to create, edit and test a robots.txt file that you can then save and upload to your site, all for free!

    Don't pay for this file. It's very simple to do and can be very good for controlling what the search engines do with your site and what they show in search results.
    • Profile picture of the author hilarious89
      You are so right. Paying for this kind of easy work would be rather foolish.
  • Profile picture of the author monkseo
    Robots.txt is a de facto standard, which means it is followed by people who adhere to web and SEO conventions, but it is not mandated by any search engine or web standards authority.

    In other words, it is not necessary, but it is good to have.

    For more info visit: The Web Robots Pages
    • Profile picture of the author simona86
      I have my robots.txt placed in the root of a subdomain, i.e. 'subdomain.mywebsite.com/robots.txt'. I also have the rule 'Disallow: /' and it is not helping me at all. However, a member on another forum told me the following:

      Instead of:
      User-agent: *
      Disallow: /

      I must have
      User-agent: *
      Disallow: /http://subdomain.mywebsite.com
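
      For comparison, here is how the two forms behave under Python's standard `urllib.robotparser`. This is a sketch, not a statement about how any particular search engine parses the file; `is_allowed` is a made-up helper and page.html is a hypothetical page. Disallow values are matched against the path part of a URL on the host that serves the file, so the full-URL form appears to match nothing on a normal site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: list[str], url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules let the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)


page = "http://subdomain.mywebsite.com/page.html"

path_rule = ["User-agent: *", "Disallow: /"]
url_rule = ["User-agent: *", "Disallow: /http://subdomain.mywebsite.com"]

# The rule is compared with the path "/page.html", not with the full URL.
blocked_by_path_rule = not is_allowed(path_rule, page)
blocked_by_url_rule = not is_allowed(url_rule, page)
print(blocked_by_path_rule, blocked_by_url_rule)
```

      In this check the original 'Disallow: /' blocks the page, while the suggested replacement leaves it crawlable.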
  • Profile picture of the author shophia
    Robots.txt is a text file used to keep content out of the crawl of search engine spiders. Here is how the directives in a robots.txt file are used.

    User-agent: describes which spider the following rules apply to. * is a wildcard that means all spiders; use Googlebot for Google.

    Disallow: describes which folders are prohibited. An empty value means nothing is excluded, / means everything is excluded, or a folder name can be used to specify what is prohibited.
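
    The three kinds of Disallow value can be compared side by side. A minimal sketch with Python's standard `urllib.robotparser`; the helper name, the /private/ folder and the example.com URLs are invented for illustration:

```python
from urllib.robotparser import RobotFileParser


def is_allowed(disallow_value: str, url: str) -> bool:
    """Check url against a single Disallow rule that applies to all spiders."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return parser.can_fetch("Googlebot", url)


home = "https://www.example.com/index.html"
private = "https://www.example.com/private/report.html"

# Each entry holds (home page allowed?, private page allowed?).
results = {
    "empty": (is_allowed("", home), is_allowed("", private)),
    "/": (is_allowed("/", home), is_allowed("/", private)),
    "/private/": (is_allowed("/private/", home), is_allowed("/private/", private)),
}
print(results)
```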
  • Profile picture of the author northtrans
    Robots.txt is placed in the root folder of the domain. It allows or disallows search engines from crawling particular parts of your site.
  • Profile picture of the author johnasthlon
    Robots.txt gives search engine bots instructions about which pages you would or would not like them to crawl and index.

    If you don't want a page indexed, add Disallow: followed by the page path; if you do want a page indexed, add Allow: followed by the page path.
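
    Allow and Disallow can be combined to open up one page inside a blocked folder. A sketch with Python's standard `urllib.robotparser`, using made-up /docs/ paths. One assumption to flag: parsers differ on precedence (Python's uses the first matching rule, Google documents using the most specific one), so the Allow line is placed first here to get the same answer either way.

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Allow: /docs/public.html",  # exception listed before the broader block
    "Disallow: /docs/",
]

parser = RobotFileParser()
parser.parse(rules)

base = "https://www.example.com"
public_ok = parser.can_fetch("Googlebot", base + "/docs/public.html")
secret_ok = parser.can_fetch("Googlebot", base + "/docs/secret.html")
other_ok = parser.can_fetch("Googlebot", base + "/index.html")  # no rule matches
print(public_ok, secret_ok, other_ok)
```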
  • Profile picture of the author haiwasnm
    Allow only the Google AdSense bot using robots.txt.

    Add the following rules:

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /
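
    To check that a rule set really admits only the AdSense crawler, here is a sketch with Python's standard `urllib.robotparser`. It assumes a second, catch-all group (User-agent: * with Disallow: /) that blocks everyone else; without such a group the other crawlers stay unrestricted. The example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: Mediapartners-Google",
    "Disallow:",  # empty value: nothing is blocked for this bot
    "",
    "User-agent: *",
    "Disallow: /",  # assumed catch-all group: everything blocked for the rest
]

parser = RobotFileParser()
parser.parse(rules)

page = "https://www.example.com/article.html"
adsense_ok = parser.can_fetch("Mediapartners-Google", page)
googlebot_ok = parser.can_fetch("Googlebot", page)
print(adsense_ok, googlebot_ok)
```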
    • Profile picture of the author hilarious89
      Originally Posted by TheeBook

      Does blogspot have it?
      You can't upload your own robots.txt file on free blogging platforms such as Blogger or WordPress. These free blogging platforms don't provide you with cPanel. They give you subdomains.
  • Profile picture of the author chriscyan
    robots.txt is a text file (not a meta tag) that you use to keep search engines from crawling your site.
  • Profile picture of the author onlinemegastore
    Some companies don't want anybody to find their personal pages or account pages, so they list those pages in a robots.txt file. You can't do that on free blogging sites.
  • Profile picture of the author heenakapoor
    robots.txt is used to tell search engine spiders which pages may be crawled and which may not. Suppose your website has some pages that are necessary for the site but irrelevant to search. When Google's spiders come and find those irrelevant pages, it can be harmful for your website. The best way to avoid that is to use robots.txt.