REMOVING pages from Google index

by Filter
7 replies
  • SEO
Hopefully a tip for someone out there...

I recently re-organized one of my WP sites - changed a few categories, renamed a few pages for better SEO, etc. Of course, the problem I then had was a bunch of 404 errors in the SERPs and a whole bunch of crawl errors in Webmaster Tools. Not very Google-friendly!

I tried adding new content, resubmitting sitemaps, getting G to re-crawl...nothing would get rid of those errors.

Then I found it (and sorry if you already knew about this, it was a new find for me).

In Webmaster Tools, find the crawl errors as usual under Diagnostics -> Crawl Errors and copy every URL that returns a 404.

Next, go to Site Configuration -> Crawler Access -> Remove URL. Click New Removal Request, enter a URL, select the reason you want it removed, and bingo - the request goes into a pending queue and the URL eventually gets removed from Google.
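
One tip before you submit: the removal tool generally wants the page to actually be gone (a 404 or 410) or blocked, so it's worth double-checking each URL first. A rough sketch in Python - the URL list is just a placeholder for whatever you copied out of Crawl Errors:

import urllib.request
import urllib.error

# Placeholder list - paste in the URLs you copied from Crawl Errors
old_urls = [
    "http://www.mysite.com/badpage/",
    "http://www.mysite.com/badpage2/",
]

for url in old_urls:
    try:
        urllib.request.urlopen(url, timeout=10)
        print("STILL LIVE (fix before submitting):", url)
    except urllib.error.HTTPError as e:
        # A 404 or 410 here is exactly what the removal tool wants to see
        print(e.code, "- OK to submit:", url)
    except urllib.error.URLError as e:
        print("Unreachable:", url, "-", e.reason)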

Hopefully this helps someone out - it's bugged the hell out of me for weeks!

Cheers
#google #index #pages #removing
  • erazer
    Nice tip, thanks. I need to re-title some pages from 4+ word titles down to 3-or-fewer word titles and this will help a lot.
  • Filter
    You're very welcome guys, hope it helps

    Cheers
    • Filter
      Sorry guys, just an update to this one. Looks like you also have to disallow the error pages in robots.txt until Google gets around to re-indexing your site.

      The removals that I submitted have been accepted but they are still indexed and still show as crawl errors, even though Webmaster Tools shows the bot has been back to the site after I removed the pages (thanks Google!!).

      If you have ever looked for robots.txt in a WordPress install, you'll know it doesn't exist - it's a virtual file. The easiest way around this is to install the "KB Robots.txt" plugin. Then list all the pages you want removed - note that Disallow takes the path relative to your domain, not the full URL - like this:

      User-agent: *
      Disallow: /badpage/
      Disallow: /badpage2/
      etc etc

      And don't forget to include your sitemap as well:

      Sitemap: http://www.mysite.com/sitemap.xml.gz
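
      Quick side note: since Disallow wants just the path, one easy way to turn the URLs you copied out of Crawl Errors into robots.txt lines is to strip the domain off each one. A rough Python sketch - the URL list is only a placeholder for your own pages:

      from urllib.parse import urlparse

      # Placeholder URLs - paste in the 404s copied from Crawl Errors
      bad_urls = [
          "http://www.mysite.com/badpage/",
          "http://www.mysite.com/badpage2/",
      ]

      for url in bad_urls:
          # robots.txt Disallow rules match on the path, not the full URL
          print("Disallow: " + urlparse(url).path)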


      Once you see the bad pages are de-indexed (or no longer showing as crawl errors), you can safely remove them from robots.txt.
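
      And if you want to confirm the plugin is actually serving those rules before waiting on Googlebot, Python's built-in robotparser can fetch your live robots.txt and test each page. A minimal sketch, assuming the same placeholder domain:

      from urllib import robotparser

      rp = robotparser.RobotFileParser()
      rp.set_url("http://www.mysite.com/robots.txt")  # placeholder domain
      rp.read()  # fetches and parses the live robots.txt

      for path in ["/badpage/", "/badpage2/"]:
          url = "http://www.mysite.com" + path
          # can_fetch() returns False when the URL is disallowed for that agent
          print(url, "is blocked" if not rp.can_fetch("*", url) else "is NOT blocked")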

      Cheers
      • ceweqsakti
        Originally Posted by Filter (quoted above)
        Does this method work with posts?
        • sylviaunlimited
          Thank you so much. Do you know if this might be why my site hasn't been re-indexed? I had a small placeholder page up while the site was being built. It's now been two weeks and my site still has not been re-indexed.

          thanks!
          Signature: see my crazy IM journey at http://www.sylviaunlimited.com
