Google is fascinated with my site but hasn't indexed any pages yet except the static ones

23 replies
Haven't been around here for a while, been out of the SEO game for a bit. Have recently made a site that, hmm, curates product data (kind of like price comparison - trying to do something useful with the data, not just create spammy scraped product content). I have quite a large number of these kinds of page (hundreds of thousands, but I could create millions if I wanted). The site gets referred traffic, and users find it useful, staying on the site and searching for several minutes at a time.

Google has been crawling my sitemaps (especially my sitemaps!), but also content pages for months now, and it's got even more intense lately, crawling hundreds of pages every day. Yet it has still steadfastly refused to index a single one!

Honestly, I don't want SEO advice (if you have 5 posts in your history and you are going to write "write blog comments", please don't bother), I know the difficulty in getting product pages indexed and ranking, I will get there in the end, I am working on the site in the meantime.

I am just wondering at this behaviour on the part of Google, crawling my site like crazy yet not indexing anything - anyone seen that before? I feel like blocking the bot, lol. Don't waste my resources if you aren't going to index me, you know?

EDIT: just realised I posted a similar question months ago - the situation hasn't changed since then in terms of indexing, only that Google has stepped up crawling big-time. Very odd...
  • Profile picture of the author yukon
    Banned
    • Are you blocking Google with a noindex?
    • Are the pages cached?
    • Are the pages buried in Supplemental SERPs?
    • Profile picture of the author markowe
      Originally Posted by yukon View Post

      • Are you blocking Google with a noindex?
      • Are the pages cached?
      • Are the pages buried in Supplemental SERPs?
      Scared me for a minute, I have a

      User-agent: *
      Disallow:
      at the beginning of my robots.txt file. That should mean NO pages are disallowed, right? Anyway, Google is actually VISITING the pages, but no, there is no NOINDEX directive on those pages, either in the http headers or in the meta tags. Plus when I manually fetch pages (Fetch as Google) it fetches and renders them fine with no issues reported... Wonder if there is anywhere else a noindex could creep in?
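For anyone wanting to rule out the same two things, here is a minimal sketch using only Python's standard library. It checks that an empty `Disallow:` really allows everything, and looks for a noindex in either an `X-Robots-Tag` header or a robots meta tag. The function names (`robots_allows`, `has_noindex`) and the example.com URL are made up for illustration; the headers and HTML are passed in by hand, so nothing here fetches a live page.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _MetaRobots(HTMLParser):
    """Collects the content of every <meta name="robots"> or "googlebot" tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())


def has_noindex(headers: dict, html: str) -> bool:
    """True if an X-Robots-Tag header or a robots meta tag contains noindex."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    finder = _MetaRobots()
    finder.feed(html)
    return any("noindex" in directive for directive in finder.directives)


# An empty Disallow blocks nothing; "Disallow: /" blocks everything.
print(robots_allows("User-agent: *\nDisallow:", "Googlebot",
                    "https://www.example.com/product/1"))
print(robots_allows("User-agent: *\nDisallow: /", "Googlebot",
                    "https://www.example.com/product/1"))
```

This only covers the places a noindex normally lives (response headers and the page's own meta tags); it says nothing about how Google actually treats the page.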

      Pages are KINDA cached - load times used to be really long, but now I am ajax loading some of the slower stuff (Google stepped up crawling as a result). Even the dynamically-loaded stuff IS cached, but data needs to be refreshed every hour or so, so that doesn't really help speed up Google fetch times that much. I am looking at caching more stuff that doesn't need to be refreshed so often, so there is more to see before the ajax stuff pops up. I definitely do think page load times could be an issue. I have to fetch a lot of stuff from 3rd party web services...

      Nothing in supplemental serps - site:www.mysite.com yields nothing except 6 or 7 static pages.

      I am sort of not TOO concerned, I feel like I will get there in the end, the interest from the crawler suggests maybe it's taking time to gain trust etc. I don't think Google is in the business of handing people free serps for product pages these days, considering all the past abuse with thin sites, BANS, AOM and all the rest of them.
      Signature

      Who says you can't earn money as an eBay affiliate any more? My stats say otherwise

  • Profile picture of the author yukon
    Banned
    Ajax pages?

    What's the cache (text version) look like? Is it thousands of blank pages with just a static header/sidebar/footer? No content?
    • Profile picture of the author markowe
      Originally Posted by yukon View Post

      Ajax pages?

      What's the cache (text version) look like? Is it thousands of blank pages with just a static header/sidebar/footer? No content?
      Thanks for the feedback - you are one of the handful on this forum whom I can actually trust for a reasoned opinion! What's your experience with these things, are these loading times becoming a big issue? They weren't so much last time I built a serious site.

      In a nutshell, yes, my pages are effectively blank until the async content loads. What happened was the flat html version was taking way too long to load in some cases (like I say, it does get cached for up to an hour, but has to be refreshed after that, and refreshes are triggered by visitors, I can't precache hundreds of thousands of pages every hour).

      So I started dynamically loading the main content, but you are right, the page is almost blank (header, sidebar etc) until the ajax content loads. This is only a recent development because like I say, prior to that change the whole content was being served immediately, and that change DID lead to a big increase in Google's crawl rate. But yes, I am going to work on serving up more static content for the bot to munch on before the dynamic stuff loads. It's just a big load on my hosting (a big database and/or a LOT of flat files to be stored), so I have to consider dedicated hosting before I have even got a single page indexed, which is obviously an investment.
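The caching arrangement described above (content kept for up to an hour, refreshed only when a visitor asks for it after expiry) can be sketched roughly like this. This is not the site's actual code: the class name `TTLCache`, the `fetch` callback standing in for the slow third-party lookup, and the `peek_stale` helper are all assumptions for illustration. `peek_stale` shows one possible way to put the last known copy into the initial HTML so the page is not blank while the fresh data loads.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Caches fetch(key) results and refetches once an entry is older than ttl."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical slow lookup (e.g. a web service)
        self._ttl = ttl_seconds      # one hour by default, as in the post
        self._clock = clock          # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        """Return a fresh value, calling fetch only if the entry has expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: no slow call needed
        value = self._fetch(key)     # slow path, triggered by this visitor
        self._entries[key] = (now, value)
        return value

    def peek_stale(self, key: str) -> Optional[str]:
        """Return the last stored value even if expired, or None if never fetched."""
        entry = self._entries.get(key)
        return None if entry is None else entry[1]
```

With something like this, the server could render `peek_stale(key)` straight into the page and let the ajax call use `get(key)` to refresh it, at the cost of sometimes showing data that is more than an hour old.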

      I realise Google isn't stupid, and I can't just serve up a page header and say, "there you go, it's loaded, oh yeah, just wait another 5 seconds for the rest". I just wonder at what point it's going to go, OK, let's index a few pages and just see how they perform, you know, cut me a break already, Google!
  • Profile picture of the author markowe
    P.S. Google SEES the ajax content once it is rendered, I tested that with the Fetch as Google tool, it's not just seeing a bunch of blank pages.
    • Profile picture of the author yukon
      Banned
      Originally Posted by markowe View Post

      P.S. Google SEES the ajax content once it is rendered, I tested that with the Fetch as Google tool, it's not just seeing a bunch of blank pages.
      I'd still go with what Google is caching (text version) before the Fetch as Google tool, odds are the Fetch tool has a slower countdown timer than the real Google bot before it bails from the webpage/s.

      You can also look at the Google cache HTML source code and that will tell you if Google is finding the content.
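A rough way to run that check yourself on any saved HTML source (the cached copy, or the raw response before any JavaScript runs) is to strip the markup and search what is left. This is a minimal sketch with Python's standard library; `visible_text` and `content_in_source` are made-up names, and the sample pages in the usage lines are invented.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the page's text content with tags, scripts and styles removed."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


def content_in_source(html: str, phrase: str) -> bool:
    """True if phrase appears (case-insensitively) in the text of the HTML."""
    return phrase.lower() in visible_text(html).lower()


# Invented examples: an ajax "shell" page versus a fully rendered one.
shell = ('<html><body><header>My Site</header><div id="main"></div>'
         '<script>load("Widget X price history")</script></body></html>')
full = ('<html><body><header>My Site</header><div id="main">'
        '<p>Widget X price history</p></div></body></html>')
print(content_in_source(shell, "Widget X price history"))
print(content_in_source(full, "Widget X price history"))
```

If the phrase only shows up inside a script, as in the shell page, the text-only view has nothing but the header and sidebar to show.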

      Really it's hard to help without seeing some problem URLs.
  • Profile picture of the author paulgl
    You say you have "hundreds of thousands" of these pages.

    Hundreds of thousands of pages, but none worth indexing.

    Seriously. How many of those hundreds of thousands of pages are worth seeing?

    Why not create millions and see what happens?

    I can imagine google seeing hundreds of thousands of pages, probably tons of repeated shtuff, found elsewhere, etc. and just says, meh.

    Paul
    Signature

    If you were disappointed in your results today, lower your standards tomorrow.

    • Profile picture of the author markowe
      Originally Posted by paulgl View Post

      I can imagine google seeing hundreds of thousands of pages, probably tons of repeated shtuff, found elsewhere, etc. and just says, meh.

      Paul
      I expected no other comment from you, and I quite agree that it takes Google some convincing that what I have is worth indexing (want me to link you some sites with thousands of worthless pages indexed? You know there's more to it than that).

      My main issue is why the hell is Google crawling hundreds of pages every day? It makes no sense if it finds them worthless. I just find it odd behaviour. If it wasn't crawling them at all then I would know where the problem probably lay.

      Yukon, if only it would index one single solitary page, then I could actually see what the bot is seeing...

      Well, I'll report back if anything happens, I am interested in the experiment as much as anything.
    • Profile picture of the author yukon
      Banned
      Originally Posted by paulgl View Post

      You say you have "hundreds of thousands" of these pages.

      Hundreds of thousands of pages, but none worth indexing.

      Seriously. How many of those hundreds of thousands of pages are worth seeing?

      Why not create millions and see what happens?

      I can imagine google seeing hundreds of thousands of pages, probably tons of repeated shtuff, found elsewhere, etc. and just says, meh.

      Paul



      I haven't seen the pages but doubt it's that dramatic.

      Odds are the ajax is slower loading than the Google bot timeout when indexing OP's pages.

      Again, the proof is on the cache (text version) page/s and cache HTML source code. Either the content exists in the cache or it doesn't. No guessing.
      • Profile picture of the author markowe
        Originally Posted by yukon View Post

        I haven't seen the pages but doubt it's that dramatic.

        Odds are the ajax is slower loading than the Google bot timeout when indexing OP's pages.

        Again, the proof is on the cache (text version) page/s and cache HTML source code. Either the content exists in the cache or it doesn't. No guessing.
        Page load time is pretty instantaneous once a server response is received, the ajax content takes a further 3-5 seconds to load (where there isn't an existing cache for that content - when there is it's a lot quicker). Would be interesting to know what the crawler's timeout is.

        My feeling is that it doesn't actually time out that quickly, it will fetch the content, but it's a negative ranking/indexing factor, as in, it doesn't like to index slow pages. Will have to work on improving these load times in any case.
  • Profile picture of the author pauloadaoag
    Administrator
    It would help if you can give us your website's url
  • Profile picture of the author hnrindani
    1. Check robots.txt and whether the pages carry a noindex directive.
    2. Check whether Google has penalized the site.
    3. Resubmit the site through Webmaster Tools.
  • Profile picture of the author maxsi
    How many pages do you have?

    Do you use organic SEO + linking?
  • Profile picture of the author Matthew Payne
    Google has been taking longer to index websites. My wait time has been 2-3 months lately. However, your hundreds of thousands of pages are interesting. As a previous poster asked, are they worth indexing?
    • Profile picture of the author yukon
      Banned
      Originally Posted by Matthew Payne View Post

      Google has been taking longer to index websites. My wait time has been 2-3 months lately.


      You can't be serious.

      No way on 2-3 months just to index.

      I can index a default Wordpress install in 24 hours without even trying.
      • Profile picture of the author Matthew Payne
        It depends on the keywords. Maybe index isn't the right term. For less competitive keywords I can get it to show up almost the same day I put the website up. For competitive keywords, the website might be indexed, but it can still be a while before it actually shows up in the search results (as in a month or two or longer). That only started happening to me this year.
        • Profile picture of the author markowe
          Originally Posted by Matthew Payne View Post

          It depends on the keywords. Maybe index isn't the right term. For less competitive keywords I can get it to show up almost the same day I put the website up. For competitive keywords, the website might be indexed, but it can still be a while before it actually shows up in the search results (as in a month or two or longer). That only started happening to me this year.
          Interesting you should say that, from the behaviour of the crawler I get the impression the way Google assesses sites has changed a lot in the last few years.

          There are the load times that we've mentioned above that it SEEMS to be feeling out.

          I also get the bot crawling pages in huge flurries, like a few hundred at a time in the space of a few minutes and then almost nothing for hours, which I don't remember it doing before, it's almost like it's stress-testing, I actually had to limit crawl rate because of that, I didn't really mean the site for high volume traffic.
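To see flurries like that in your own logs, one option is to bucket crawler requests by minute. This is a minimal sketch that assumes an Apache/Nginx-style access log with a `[dd/Mon/yyyy:hh:mm:ss +zone]` timestamp and the user agent on the same line. The function names and the threshold are made up. It also matches on the user-agent string alone, which anyone can fake, so treat the counts as approximate.

```python
import re
from collections import Counter
from typing import Iterable, List

# Captures the timestamp down to the minute, e.g. "10/Dec/2017:13:55".
_MINUTE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]+\]")


def googlebot_hits_per_minute(lines: Iterable[str]) -> Counter:
    """Count log lines mentioning Googlebot, keyed by minute."""
    hits: Counter = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        match = _MINUTE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


def burst_minutes(hits: Counter, threshold: int) -> List[str]:
    """Minutes with at least `threshold` hits, sorted as plain strings."""
    return sorted(minute for minute, count in hits.items() if count >= threshold)
```

Reading the file is just `googlebot_hits_per_minute(open("access.log"))` (the file name is a placeholder), and something like `burst_minutes(hits, 100)` would list the minutes where the bot came in a flurry.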

          Then there is the way it's CONSTANTLY downloading sitemaps as well, it must be downloading the same ones multiple times in the space of a few days/weeks, despite almost nothing ever changing in my sitemaps. I really can't figure out why it would continue to do that when their content isn't changing.

          There's certainly a lot we don't know about how the whole thing works, good job by Google for managing to keep this obscure!

          What I can say with some certainty is that crawl rates directly correspond with page load speeds - see my chart here:



          You maybe can't see it so well, but average page load times (green) dropped from 6-7 seconds to 3-4 seconds recently (due to caching I added), and the bot directly responded to that by increasing crawl rate (blue) from maybe a hundred pages a day to more like 700-1000.
  • Profile picture of the author chrisniel
    It depends on robots.txt.
    Please double-check whether an SEO plugin in your WordPress install has changed the rules in your robots.txt.
  • Profile picture of the author markowe
    Update - well, finally, Google started indexing my product pages, after, what, 6 months or more? You've just got to be persistent... Oh, and the big jump happened the exact day I switched to https - make of that what you will (probably coincidence, but still)



    Oh yeah, and I moved to VPS, optimised a lot of MySQL queries and absolutely slashed load times:



    This stuff matters nowadays, there's no two ways about it.
    • Profile picture of the author pauloadaoag
      Administrator
      can you share your crawl rate plots as well
      • Profile picture of the author markowe
        Sure, here is the whole thing - as you can see Google has been crawling in the thousands of pages on a daily basis, that was the weird thing about it taking so long to start indexing:

  • Profile picture of the author markowe
    Here we go, just for the doubters, there's been another huge spike in indexed pages. I post this for the doubters (the usual miserable gits on this forum!) who said you couldn't get hundreds of thousands of pages indexed. Well of course we're nowhere near that yet, but my problem until a month ago was I couldn't get a single page indexed, so I think the situation is changing somewhat. And of course it will take time to get decent ranking for those pages.



    So, sure you can get large numbers of pages indexed, but there definitely seems to be a period (in the order of 6 months) where you have to build trust, and the Google bots will apparently happily crawl ten thousand of your pages a day without indexing anything.
  • Profile picture of the author markowe
    OK, I think I will probably post on this thread for the last time, but just to update you. So I started this site back in spring 2017, gradually creating hundreds of thousands of pages of - not original but USEFUL purchase-related content (think statistical data, usefully interlinked data, custom-generated graphics etc.).

    And for about 6 months Google literally only indexed my 10-20 blog posts and ignored those product-related (i.e. money) pages. Since the last post when G was crawling lots of pages, nothing much happened until around Christmas time, when suddenly Google's crawl-rate went through the roof:



    It was panic stations for about a week, as I had never really planned for a high (at one point) of nearly half a MILLION pages crawled per day, which made my previous "high" of 16,000 pages look like just a blip on the graph! I discovered Google often crawls at quite consistent rates: e.g. one page every 2 seconds, 2 pages per second, 4 pages per second, and for about half a day even 10 pages per second! I had no idea WordPress could cope with that kind of traffic but my site just about held up (after I temporarily got rid of some site features with slow queries etc.) - my site response times actually dropped quite a lot during that time as you can see above, due to my hasty optimisations.
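To put those rates in per-day terms, here is the arithmetic as a small sketch (the function name `pages_crawled` is made up; the rates are the ones quoted above):

```python
def pages_crawled(pages_per_second: float, hours: float = 24.0) -> int:
    """Pages fetched at a steady rate over the given number of hours."""
    return round(pages_per_second * hours * 3600)


# Sustained for a full day:
print(pages_crawled(0.5))           # one page every 2 seconds -> 43200
print(pages_crawled(2))             # 2 pages per second       -> 172800
print(pages_crawled(4))             # 4 pages per second       -> 345600

# Sustained for half a day only:
print(pages_crawled(10, hours=12))  # 10 pages per second      -> 432000
```

So 10 pages per second for half a day comes to 432,000 fetches, which is in the same ballpark as the "nearly half a million" peak, although the post doesn't say that is exactly how the peak came about.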

    Oh yes, and you might be interested to learn that Google counts sitemap crawls and ajax content crawls in its "pages crawled" count. And also that ajax pages are crawled quite independently of the main content, and are not always (in fact rarely) fetched along with the corresponding static content. Google crawls that dynamic stuff on its own schedule (more slowly too, as that kind of content is often slower to serve by nature).
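If you want to see how much of a "pages crawled" figure is really sitemaps and ajax requests, you can split the crawled URLs by type. This is only a sketch: the rules below (any `.xml` or `.xml.gz` path containing "sitemap", a hypothetical `/ajax/` prefix, and WordPress's `admin-ajax.php`) are assumptions and would need adjusting to the site's real URL layout.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit


def classify(path_or_url: str) -> str:
    """Label a crawled URL as 'sitemap', 'ajax' or 'page' (assumed URL rules)."""
    path = urlsplit(path_or_url).path.lower()
    if "sitemap" in path and path.endswith((".xml", ".xml.gz")):
        return "sitemap"
    if path.startswith("/ajax/") or path.endswith("admin-ajax.php"):
        return "ajax"
    return "page"


def crawl_breakdown(urls: Iterable[str]) -> Counter:
    """Count crawled URLs per category."""
    return Counter(classify(url) for url in urls)


# Invented example URLs.
print(crawl_breakdown([
    "/sitemap-products-1.xml",
    "/wp-admin/admin-ajax.php?action=prices&id=7",
    "/product/widget-x/",
    "/product/widget-y/",
    "/ajax/prices/7",
]))
```

Only the 'page' count corresponds to content pages the crawler has actually looked at, so that is the number to compare against the indexed-page count.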

    So now I have around 500,000 pages indexed, which, without any real SEO etc. translates to about 300 Google search visits per day. Not that much maybe, but more than many sites I have owned in the past. Now I can actually get on with optimising the site and promoting it to improve rankings for individual pages.

    I just want to say the moral of the story is you CAN get a huge number of pages indexed on a new site, but you need to be ready to take TIME to gain Google's trust and put in a lot of work making your site actually useful and not just spam. More than 6 months in my case. Oh, and for the love of God buy up your domain several years in advance, get an SSL certificate and get dedicated, or at least VPS hosting.

    I could write some tips on using freely available "big data" to generate your own very large sites like mine but I think I have said enough. Suffice to say you need a programmer for that kind of job but it can be done, with some imagination. Thanks for all the shared tips, especially Yukon's hints about making sure my caching was on point - that and page load times is a big one. As you can see I was originally talking about page load times of 3-7 seconds (?!?!), now we are talking less than 200ms - that's a lot of work optimising MySQL queries but boy did it pay off.

    Anyway, that's enough for me, gotta go and actually try to make some money from this now! Over and out!