Google Sitemap, Pages not indexed?

7 replies
Hey Everyone,

I submitted a sitemap to Google Webmaster Tools and it shows the following:

URLs Submitted: 24,509
URLs Indexed: 10,588

When I search Google for site:mywebsite.com, it comes back with only 1,900 pages. Does anyone know why this is? I know the "site:" operator shows more than that for other sites; I have seen it return up to 300,000. Any insight into this? Thanks!
#google #indexed #pages #sitemap
  • Profile picture of the author Marko Vel
    When did you set up the site?
    • Profile picture of the author DDistel
      Originally Posted by Marko.V

      When did you set up the site?

      Do you mean when did I start the website? If so, the website has been up and regularly updated for 5+ years.
  • Profile picture of the author Marko Vel
    Here are some of the top reasons why your XML Sitemap numbers might be different:

    (1) Image URLs in Sitemap - Google doesn't index images from Sitemaps (they say "we don't index images directly (instead, we index the page that contains the image). As a result, direct image URLs in your Sitemap won't be indexed".)

    (2) There are duplicate URLs in your Sitemap. This shouldn't happen with a good XML Sitemap generator, but it is always something that you should check for.

    (3) The data is out of date - Google describes the numbers as a "close approximation" which might not be 100% accurate. They note that their systems are ever changing and that there might be a lag between calculation and publication.

    (4) You have pages that are undervalued by Google. This is the old Supplemental Index problem again. Undervalued pages are not visited often, and may not be indexed at all. There are many reasons for undervalued pages, and you need to raise the authority of the site/pages in order to get them indexed.

    (5) You have pages that are orphaned - that only appear in the XML Sitemap and not elsewhere in the site. This often causes pages to be undervalued, because the value of a page is at least partly defined by the number of links to a page. Although theoretically, XML Sitemap content does not have to be accessible by crawling (Google says it is a good way to provide pages accessible by Ajax that can't be crawled by Googlebot) this can still be a barrier to being indexed.

    (6) You have a crawling problem on your website. This may cause orphaned pages as above, or crawl problems like spider traps may be stopping Googlebot from getting to the important parts of your site, leaving pages uncrawled that should otherwise be crawled. To check for this, the best place to look in Webmaster Tools is the Crawl Stats page - if there are huge peaks lasting just one day, or a number of pages listed as a "high" that very exceeds

    (7) You may have pages being hit by a duplicate content filter. If Google sees pages as too similar to each other, it may not index all the different variations of the page. This can apply to database driven (cookie cutter) pages as well as complete word for word duplications.
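
Points (1) and (2) can be checked mechanically before resubmitting. Below is a minimal sketch in Python, assuming the sitemap is a single standard `<urlset>` file (not a sitemap index) and that image URLs can be recognised by their file extension. The function name `audit_sitemap`, the extension list, and the sample URLs are illustrative assumptions, not anything from Webmaster Tools.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# Assumption: image URLs are identified purely by file extension.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def audit_sitemap(xml_text):
    """Count <loc> entries and report duplicates and direct image URLs."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    counts = Counter(locs)
    duplicates = sorted(url for url, n in counts.items() if n > 1)
    image_urls = sorted(
        {url for url in locs
         if url.lower().split("?")[0].endswith(IMAGE_EXTENSIONS)}
    )
    return {
        "total": len(locs),
        "unique": len(counts),
        "duplicates": duplicates,
        "image_urls": image_urls,
    }


# Hypothetical sample data, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://mywebsite.com/</loc></url>
  <url><loc>http://mywebsite.com/about</loc></url>
  <url><loc>http://mywebsite.com/about</loc></url>
  <url><loc>http://mywebsite.com/img/logo.png</loc></url>
</urlset>"""

if __name__ == "__main__":
    report = audit_sitemap(SAMPLE)
    print(report["total"], "submitted,", report["unique"], "unique")
    print("duplicates:", report["duplicates"])
    print("direct image URLs:", report["image_urls"])
```

If the unique count, minus direct image URLs, is much lower than the "URLs Submitted" figure, that alone may explain part of the gap. The remaining reasons (undervalued, orphaned, or near-duplicate pages) cannot be detected from the sitemap file itself.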
  • Profile picture of the author DDistel
    Thank you for your reply.
    • Profile picture of the author Larozhfuko
      Also try submitting an image sitemap to Google.
  • Profile picture of the author Larozhfuko
    I'd also suggest creating other types of Sitemaps for Google as well, if you have that kind of content (Video, Geo, Mobile, News...).
  • Profile picture of the author lightingguru
    How do you set up an image sitemap for Google? I have never heard of that.
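
For reference, an image sitemap is an ordinary XML sitemap in which each page's `<url>` entry also lists the images that appear on that page, using Google's image extension namespace. Below is a minimal sketch of generating one in Python. The function name `build_image_sitemap` and the example URLs are hypothetical, and the element layout should be checked against Google's current image sitemap documentation before relying on it.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Google's image sitemap extension namespace.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build sitemap XML from a dict of {page_url: [image_url, ...]}."""
    # Register prefixes so the output uses xmlns and xmlns:image
    # instead of auto-generated ns0/ns1 prefixes.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        # The page that contains the images.
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = page_url
        for image_url in image_urls:
            # One <image:image><image:loc> pair per image on the page.
            image = ET.SubElement(url, "{%s}image" % IMAGE_NS)
            ET.SubElement(image, "{%s}loc" % IMAGE_NS).text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page and image URLs, for illustration only.
    print(build_image_sitemap({
        "http://mywebsite.com/gallery": [
            "http://mywebsite.com/img/a.jpg",
            "http://mywebsite.com/img/b.jpg",
        ],
    }))
```

The resulting file is submitted in Webmaster Tools the same way as a regular sitemap. Note that the images are attached to the page URL that contains them, rather than listed as standalone `<loc>` entries.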