Best Way to Fully Index a Website on Google

19 replies • SEO
The social network that I run has been written about twice in WSJ.com, and it has hundreds of backlinks. However, it is still not completely indexed.

We get a couple of million monthly page views from our user base through word of mouth, but we get very little quality traffic from Google.

The domain name is new, only six months old.

Is there a good way to get it fully indexed? Only 11,000 of our 90,000 pages are indexed.

Should I be trying to get more high-PR links, or is it just a waiting game until all of the hundreds of backlinks from member badges are picked up by Google?

FYI, I have a pretty good in-house SEO who is stumped, so I am not just starting from scratch here.

Any and all suggestions are highly appreciated.

Feel free to list SEO services that I can review or any other pay services that might help get our network fully indexed.
#fully #google #index #indexing #search #seo #website
  • Michael Silvester
    Hi Natalie,

    Great question...

    I understand that you must have a massive number
    of pages to be indexed on your website.

    In my opinion, the best way to get totally indexed is
    to create a sitemap for your website and get that
    indexed by the search engines.

    Another tactic would be to deep link your site, for
    example by linking to some of your internal pages rather
    than your index page.

    Hope this helps...

    Take Care,

    Michael Silvester
  • CEO_Natalie
    We have a pretty strong deep linking strategy.

    Members post links on facebook, linkedin, blogs, etc. all pointing to their profile page.

    Thanks for the well thought out response, Michael.
    • Fender85
      Given the large scale of content on your site, this is likely the last thing you want to hear, but have you guys thought that the lack of unique titles might have something to do with the indexing (or lack thereof)? I just noticed that a lot of the titles I saw were very similar and on occasion I've seen that be enough to keep Google from indexing a page.

      I mean stuff like this:

      Groups - Sta.rtUp.Biz - The Small Business Social Network
      Videos - Sta.rtUp.Biz - The Small Business Social Network
      Classifieds - Sta.rtUp.Biz - The Small Business Social Network

      Now all of those pages are indexed, but Google is a bit unpredictable. I've seen them pick up stuff like that with no problem, and then other times I've found indexing issues, renamed the titles to make them REALLY unique, and then everything was fine.
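      As a concrete sketch of what making those titles unique could look like (the rewritten title below is an invented example, not taken from the site):

```html
<!-- Current pattern: only the first word differs between pages -->
<title>Groups - Sta.rtUp.Biz - The Small Business Social Network</title>

<!-- More distinctive: lead with page-specific keywords -->
<title>Small Business Networking Groups | Sta.rtUp.Biz</title>
```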

      Also, one thing I noticed (you guys don't have a big problem with it yet, but fixing it now is easy future-proofing) is the indexing of secured (https) versions of your website's pages. I have a more in-depth post about this on my blog, including a simple .htaccess script that fixes it by always serving the unsecured version to Googlebot - Ecommerce - Watch out for this form of duplicate content

      Oh, and this will prevent the "invalid SSL" problem mentioned earlier in the thread. If Googlebot goes to the http version of your forum's URLs instead of the https version, it never runs into that error. So taking care of serving unsecured pages to Googlebot might actually do more to fix your crawlability than just solving the usual duplicate content issue, given that your forum area seems to have this problem with its SSL certificate.
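      For reference, a rule of that general shape (a sketch only, not the exact script from the blog post, and untested against this site's setup) might look like this in .htaccess:

```apache
# Sketch: send Googlebot's https requests to the plain http version.
# A simpler variant drops the user-agent condition and redirects everyone.
RewriteEngine On
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]
```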

      Do you guys happen to know where these missing 80k pages are? I think it'd probably be a lot more helpful to know if it's certain sections of the site that are having indexing issues or what, and then at that point it'd be a lot easier to settle in on why they're not being indexed.

      Anyway, definitely take care of this SSL issue and implement the .htaccess script that I recommend on that blog post. I think that given the unique nature of this SSL certificate issue, it might be especially useful. You can probably hold off on the title issue for a while - at least until we know what area(s) of the site is/are having trouble being indexed, or if this SSL fix doesn't help clear things up.
      • countonuspr
        Absolutely tremendous advice above! I couldn't agree more. I noticed the title tag issue as well, and that was another post that I was going to write, but there is no need to now that it was brought up. Great insight, and I certainly feel that you should also do what the post above says. There is a lot that can be done for your website that will help you get more pages indexed. Keep us posted on your progress, and let us know if you have further questions.
  • Debbie Songster
    Google will only spider your site a limited number of levels deep. With that many pages, there is a very good chance that you have pages buried too deep.

    A site map is your best bet.

    Try Google tools - https://www.google.com/webmasters/to.../en/about.html

    Hope that helps
  • Joseph Then
    Sitemap is the best solution.

    Create a sitemap that has ALL the links on your website, link to the sitemap from your homepage, and Google will crawl it.

    A faster way: go to Google Webmaster Tools and submit your sitemap there. That is 100% workable.
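    For a site that isn't on a CMS, a sitemap file can be generated with a few lines of script. A minimal sketch (the URLs are placeholders; in practice you would pull them from your database or crawl your own site):

```python
# Minimal XML sitemap generator following the sitemaps.org 0.9 schema.
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs; replace with your site's real pages.
pages = [
    "http://example.com/",
    "http://example.com/groups",
    "http://example.com/videos",
]
with open("sitemap.xml", "w") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n' + build_sitemap(pages))
```

    Once the file is uploaded, you can submit its URL through Google Webmaster Tools.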
  • countonuspr
    I am happy to help you here. How many pages of content should the search engines be indexing? When I do a site search on Google I find 10,900 pages of content indexed within their search engine. Here is the search that I did:

    site:sta.rtup.biz - Google Search

    I am sure your in-house SEO has already done this. So it would help me better assess the problem if I knew how many pages of content you should have indexed.

    When I do the same site search for your forum I only see 848 pages indexed.

    site:sta.rtup.biz/forum - Google Search

    I am sure your forum has more than 848 posts on it. Are you looking for more posts to be indexed? When I clicked on your forum it took me to a page that said your SSL certificate was invalid. This could be preventing the search engine spiders from going into your forum and picking up all of the links. Go to a computer that has not accessed your site recently (or clear your cookies and cache), visit your forum, and see if you get the secure certificate error. This could be stopping the spiders from going further.

    The other issue that I am finding is that you need an RSS feed that actually picks up all of your content. Your RSS feed is only picking up a small portion of your content, and the content it is picking up is not quality content.

    Further, you need a sitemap with text-based links. The search engines love text-based links and can pick them up much more easily than image links. Your sitemap should at least list all of your main pages and categories.

    One other important thing: your top navigation should also appear in the footer of your website, in text-based form. I am not talking about all of the drop-downs, just the main menus. Your top navigation is currently set up as image links, and again, the search engine spiders have a hard time picking up image links.
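    For illustration, a text-based footer navigation along these lines (the paths are invented, not taken from the site) gives the spiders plain links to follow:

```html
<!-- Plain text links mirroring the main (image-based) navigation -->
<div id="footer-nav">
  <a href="/groups">Groups</a> |
  <a href="/videos">Videos</a> |
  <a href="/classifieds">Classifieds</a> |
  <a href="/forum">Forum</a>
</div>
```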

    So the most important fixes are: add a sitemap, add your top navigation to the footer as text-formatted links, and fix your RSS feed so it picks up more content. Once your RSS feed is fixed, ping it with Ping-o-Matic! and submit the feed's direct URL to RSS feed directories.

    These simple fixes will help the search engines pick up your content. As you add new categories use a site like Digg.com to bookmark your link. This will get the search engine spiders to your page fast.

    I hope that this helps some. I am going to be travelling today, but feel free to send me a PM with more questions! Take care!
    • countonuspr
      My post was so long the others snuck in ahead of me! LOL! They are right on that creating a Sitemap is important. Also, read my post above thoroughly because I mention how you should also really get your RSS feed fixed so it picks up your content. Your site is so big that you may need an RSS feed for your forum, an RSS feed for your articles, and an RSS feed for your groups. Amazon.com is a good company to copy because they have RSS Feeds for almost every category on their website.
  • CEO_Natalie
    We did not make any SSL pages, so I am definitely surprised to even see that forum page indexed.

    Thanks for pointing that out.
    • Fender85
      Thanks man, I appreciate it. And Natalie, you have the same situation with your videos category as well. Looks like the forum, videos, and then a few member pages. Here's the list of the https pages that Google has indexed - site:sta.rtup.biz/ inurl:https - Google Search

      So anyway, just get that .htaccess script implemented and as Google goes out to grab a fresh cache of those URLs it'll take care of itself. Hopefully that's just a simple little "gotcha" glitch and fixes things up. But again, if you happen to know a certain area (or areas) that aren't being indexed, it'll help with trying to diagnose what's up.

      Actually, they do have a version of your homepage indexed with https as well (also a few duplicate versions of the homepage). http://www.google.com/search?q=site:...hl=en&filter=0 Weird. I'm not sure how they're accessing them, but a viable guess is that they run into the https versions, hit the error, then back out and leave the site.
    • dburk
      Hi CEO_Natalie,

      The Googlebot limits how many pages it crawls on your IP during any given hour. If they did not throttle their bot this way, it would shut down many webservers that can't handle that kind of volume. So, when you have a lot of pages to get indexed, it's going to take a while before the bot can get around to all of them.

      If you are concerned, you can use Google's Webmaster Tools to verify whether Googlebot is having problems crawling your site. You can even manage how Google indexes your site there.

      Why guess when you could know?
  • Fender85
    Oh, and one other thing: checking Yahoo!'s backlink data for your domain, they show 12,800 backlinks, and all but 1,000 point to your homepage. So when you consider 1,000 backlinks divided among the other 89,999 pages of content on your site, there are definitely some pages being left out of the mix.

    Yahoo! is a bit slow to crawl and update new backlinks, but Google hardly tells you anything about the backlinks it knows about, so we just have to use Yahoo!'s numbers. Anyway, given that spiders only stick around for a certain period of time once they hit the site, that could leave a lot of pages never getting discovered.

    Also, Don had a great idea with the Webmaster Tools account. Who knows, you might have a spider trap and not know about it. It happened to a guy I know once: the spider kept getting stuck in his calendar and going next, next, next, etc., until it would finally abort and leave the site. He excluded his calendar in his robots.txt file and all was well. Worth checking to see if there are any alerts waiting for you.
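    For what it's worth, that kind of exclusion is a one-line robots.txt fix. The /calendar/ path here is hypothetical, just to show the shape:

```
User-agent: *
Disallow: /calendar/
```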
      • CEO_Natalie
        Thanks again for all the help. I had not heard the term "spider trap" before, so I am listening intently and learning everything each of you write.
  • dahsyat212
    Using forums is a good way to get indexed on Google.
  • trafficguru
    How do you create a sitemap for a site that is not built on WordPress? For my wordpress sites I simply got the sitemap plugin.

    Thanks,

    Adam (AKA "The Traffic Guy")
    • Shafiq Kamal
      * Internal linking
      * Deep linking
      * Google sitemap
      * Robots inclusion meta tag set to 1 hour/2 hours/1 day, based on your volume of page creation per day
  • drogers
    Can you send some details about the sitemap plugin?
  • djoe
    I am pretty new to this. Does this stuff really work?
      • echealth
      What if you have pages that are on your sitemap and still haven't been indexed after a good amount of time has passed? I am not talking about 2 or 3 pages either, but more like 50, including PDFs. No one has time to build links for all of these pages, so what would be the best method to get them indexed, if there is indeed a way?
