What is the difference between crawling, indexing and caching?

17 replies
Hello,

What is the difference between crawling, indexing and caching?

How do caching, crawling and indexing work? Which one has higher priority? How are these steps carried out? What does Google do: does it first cache the pages/websites, or crawl the data on the page?

Note: Please do not post a definition or a Wikipedia article as a source. Replies from senior members will be appreciated.
#caching #crawling #difference #indexing
  • Profile picture of the author SimpleSEOTips
    In layman's terms:
    Google crawls your site and then indexes what it sees, storing it as a cached version of the page.
    The page's design may change before Google crawls it again and reindexes the new version, so the term "cache" works almost as a caveat: the page may have changed since it was crawled.
    If your web pages aren't crawled, they can't be indexed. Making sure your site can be crawled by bots is a priority. Set up a Google Webmaster Tools account and then submit an XML sitemap to help Google crawl and index it.
    Hope that helps.
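
For anyone who wants to see what the XML sitemap mentioned above looks like, here is a minimal sketch that builds one with Python's standard library. The `build_sitemap` helper and the example.com URLs are placeholders for illustration, not something from this thread; a real sitemap would list your own pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages on a placeholder domain.
sitemap = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/first-post",
])
print(sitemap)
```

The resulting string would be saved as a file such as sitemap.xml on the site and then submitted through the webmaster account.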

  • Profile picture of the author sanjon
    Crawling: Google sends its spiders to your website.
    Indexing: Google has visited your website and added it to its database.

    Caching: Google took a snapshot of your website when it last visited and stored the data in case your website goes down or has other issues.
    • Profile picture of the author agendapal
      Banned
      [DELETED]
      • Profile picture of the author android45
        Crawling is when search engine bots visit your site, indexing is saving your information in the database, and caching is a more detailed record on top of indexing, such as when the page was visited (date and time). From that you can see which comes first.
  • Crawling is the process of a search engine requesting, and successfully downloading, a unique URL.
    Indexing is the result of successful crawling. I consider a URL to be indexed (by Google) when an info: or cache: query produces a result, signifying the URL's presence in the Google index.
    In other words, when a new website goes live, the search engine's crawler first reads the site and then stores its contents in the index database in a different format, not as the content was published. As a result, the site can appear in search results for the keywords it is optimized for.
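
To illustrate the point about content being stored "in a different format", here is a toy inverted index: instead of keeping pages as published, it maps each word to the pages that contain it. This is only a sketch under simple assumptions (whitespace-like tokenizing, made-up pages on example.com, hypothetical `build_index` and `search` helpers) and says nothing about how Google's index is actually built.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical crawled pages: URL -> extracted text.
crawled_pages = {
    "https://example.com/": "Free SEO tips for beginners",
    "https://example.com/tools": "The best SEO tools reviewed",
}
index = build_index(crawled_pages)
print(sorted(search(index, "seo tools")))
```

A query is answered by looking up words in the index, which is why a page has to be crawled and indexed before it can show up for any keyword.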
  • Profile picture of the author brookecaitlin
    Crawling is where search engine spiders / bots move from web page to web page by following the links on the pages. The pages "found" are then ranked using an algorithm and indexed into the search engine's database.

    Indexing is where the search engine, having crawled the web, ranks the URLs it found using various criteria and places them in the database, or index.
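
The "move from page to page by following links" idea can be sketched as a breadth-first walk. The example below runs against a small made-up site held in a dictionary (the `SITE` pages and the `crawl` helper are hypothetical), so it needs no network access; a real crawler would fetch pages over HTTP and respect robots.txt.

```python
from collections import deque
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical site held in memory: path -> HTML.
SITE = {
    "/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<a href="/">Home</a>',
    "/blog": '<a href="/blog/post-1">Post 1</a>',
    "/blog/post-1": "<p>No links here.</p>",
    "/orphan": "<p>Nothing links to this page.</p>",
}

def crawl(start):
    """Breadth-first walk from start, returning pages in discovery order."""
    seen = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        parser = LinkParser()
        parser.feed(SITE.get(page, ""))
        for link in parser.links:
            if link in SITE and link not in seen:
                seen.append(link)
                queue.append(link)
    return seen

print(crawl("/"))
```

Note that "/orphan" is never discovered because no page links to it, which is the situation a sitemap is meant to help with.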
  • Profile picture of the author greatestmj
    Crawling means Google has crawled your site at least once and your site goes into its database of crawled pages. Indexing means Google has crawled your site and found it valuable enough to put it in its database of indexed pages.
  • Profile picture of the author CyborgX
    "Crawling is where search engine spiders / bots move from web page to web page by following the links on the pages. The pages 'found' are then ranked using an algorithm and indexed into the search engine's database. Caching is where copies of web pages are stored locally on an Internet user's hard drive or within a search engine's database."
    • Profile picture of the author chrissmit
      Does Google base its ranking on the site that is cached?
      Or does Google crawl a site more often than it is cached?

      My site was last cached on March 21st; I'm writing this on April 3rd.
      Is my Google rank based on the content as of March 21st, or could it be based on a later date?
      • Profile picture of the author Ralf Skirr
        Google crawls a site more often than it caches the site.

        You can check this easily by searching for some non-cached content. Google will show it in search results, even if it is not cached.

        For example: you added some new content to your site 3 days ago. The cache of your site's home page is 3 weeks old. Googling a sentence from your new content in quotes will still show the new content in the results.

        Unless, of course, the site is so infrequently updated and unimportant that Google doesn't even bother to crawl it often.

        I find that my new blog posts are crawled and indexed within minutes or a few hours. The new page will be cached immediately or within 2 or 3 days. But it might take up to 2 weeks until I can find a cached version of my blog homepage with the excerpts from new articles.

        These time frames are different for each web site.
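
One way to picture why new content can show up in results while the cached copy is still old is to treat the indexed text and the cached snapshot as two separately updated records. The model below is purely hypothetical (the `PageRecord` fields, the `crawl` helper, and the year in the dates are all made up for illustration) and is not a description of Google's internals.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class PageRecord:
    indexed_text: str = ""
    last_crawled: Optional[date] = None
    cached_text: str = ""
    cache_date: Optional[date] = None

def crawl(record, text, day, store_snapshot):
    """Refresh the indexed text on every crawl; refresh the snapshot only when asked."""
    record.indexed_text = text
    record.last_crawled = day
    if store_snapshot:
        record.cached_text = text
        record.cache_date = day

home = PageRecord()
# First visit: page is indexed and a snapshot is stored.
crawl(home, "old excerpts", date(2013, 3, 21), store_snapshot=True)
# Later visit: the index is refreshed, but the snapshot is left alone.
crawl(home, "old excerpts plus new post", date(2013, 4, 3), store_snapshot=False)

print(home.last_crawled, home.cache_date)
print("new post" in home.indexed_text, "new post" in home.cached_text)
```

In this toy model a search over the indexed text finds the new post even though the cache date still shows the earlier visit.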
  • Profile picture of the author Hansons
    Good tutorial about crawling, indexing and caching. It cleared up my doubts and I learned something too.

    Thanks

  • Profile picture of the author Aaditya321
    How can we find out when our site was crawled and indexed? For caching, the date and time are easy to find.
  • Profile picture of the author chrissmit
    You can check what is indexed (and therefore also crawled) by going to Google and typing: "site:http://YourSite.com"
    All without the " "

    You can check what's cached by going to Google and typing:
    "cache:http://YourSite.com"
    Also without the " "
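
If you check several sites regularly, the two queries above can be turned into search URLs with a few lines of Python. This is just a convenience sketch: the `google_query_url` helper is made up for illustration, YourSite.com is the same placeholder used above, and whether a given operator still returns results depends on what Google currently supports.

```python
from urllib.parse import urlencode

def google_query_url(operator, target):
    """Build a Google search URL for an operator query such as site: or cache:."""
    return "https://www.google.com/search?" + urlencode({"q": f"{operator}:{target}"})

print(google_query_url("site", "http://YourSite.com"))
print(google_query_url("cache", "http://YourSite.com"))
```

Opening either URL in a browser is the same as typing the query into the search box by hand.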
    • Profile picture of the author Laureen Davis
      Originally Posted by chrissmit

      You can check what is indexed (and therefore also crawled) by going to Google and typing: "site:http://YourSite.com"
      All without the " "

      You can check what's cached by going to Google and typing:
      "cache:http://YourSite.com"
      Also without the " "
      Thanks for sharing this simple way to check whether a website has been crawled. I think this technique is more useful than installing an add-on.
  • Profile picture of the author chandanthaver
    Indexing: your website is added to the search engine's database after crawling.
    Crawling: the spider visits the website to check its pages.
    Caching: a snapshot of the website is taken at the time of the last visit.
  • Profile picture of the author Manasarao
    Thanks for the good explanation. Understood clearly.
  • Profile picture of the author bodyrx
    * Crawling is where search engine spiders or bots move from web page to web page by following the links on the pages.
    * Indexing is where the search engine, having crawled the web, ranks the URLs it found using various criteria and places them in the database.
    * Caching is where copies of web pages are stored locally on an Internet user's hard drive or within a search engine's database.
  • Crawling is when the Google spider visits your site's pages one by one. Indexing is when Google saves your site's information into its database.
    In caching, it saves a new web page in the Google cache database, where it is ready to be indexed.
