Google ignoring robots.txt file?

2 replies
  • WEB DESIGN
For various reasons, I run a hybrid website - a lot of dynamic pages and a few static pages. Within the static pages there are some iframes. The iframes point to a folder that contains some product images and descriptions, plus some files that contain links to URLs within the dynamic part of the site.

In my robots.txt file, which was written in June this year, I have disallowed all bots from accessing the folder that contains the iframe content. Yet when I run a "site:mysite/iframefolder" search, the disallowed folders and the iframe URLs are showing up. Does anyone have an idea why this has happened now and not before? I thought search engines, and Google in particular, skipped crawling/indexing pages that were disallowed in the robots.txt file. The robots.txt file was in place long before the iframe folders were uploaded.

It can't be a coincidence that a significant number of my product pages have now been flagged as duplicate content and I appear to have been penalized.

For clarity (afterthought): there are HTML links within the iframes pointing to areas of the dynamic pages. Am I right to assume that because the parent folder is disallowed, the bots won't/shouldn't crawl the child files contained within that folder?

Thanks in advance for any help on this.
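For reference, the relevant rules in my robots.txt look roughly like this (I'm using /iframefolder/ as a stand-in for the real folder name):

    User-agent: *
    Disallow: /iframefolder/

As I understand it, a Disallow line is matched as a path prefix, so that single rule should also cover child files such as /iframefolder/product1.html - which is why I'm asking about the child files above.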
#file #google #ignoring #robotstxt
  • copyrank
    Set up Google Webmaster Tools for that site. It will sort out a lot of problems with regard to indexing, and I think it allows you to specify which pages you want crawled and which you don't.

    • rslaing
      Thanks for your reply - but this site (and my others, some going back to 2005) is already listed with Webmaster Tools. UPDATE - I have requested removal of the URLs and Google removed them in what must be record time, the same day. I still don't know why they disobeyed the robots.txt file in the first place, though...
