12 replies
I have been wondering how Googlebot crawls the pages of a website. What does it crawl first? I just need some answers on that.
#bot #crawling #google
  • Profile picture of the author jipolis7
    It depends on what exactly is in your site's robots.txt file.
  • Profile picture of the author yukon
    Banned
    Originally Posted by ArlinTelfian View Post

    I have been wondering how Googlebot crawls the pages of a website. What does it crawl first? I just need some answers on that.


    Google will crawl the first link/URL they find. End of story.
    • Profile picture of the author dave_hermansen
      Originally Posted by yukon View Post

      Google will crawl the first link/URL they find. End of story.
      Ding, ding, ding. We have a winner!

      Yukon is absolutely correct. They start at the first instance of a URL that they discover and start crawling from there. In some cases, the first indication they have of your website is setting up Google Webmaster Tools and submitting a sitemap. In other cases, their first crawl of your site may occur because of a link to it that they discover while crawling another website.

      After finding that first page, they will typically follow internal links on that page - both navigation links and textual links - to see where those go and will follow the links from the linked-to pages and so on and so on until they have either crawled them all or maxed out the brief crawl budget (amount of time they crawl a site) for your site. How often they crawl is mostly based on how much your site is updated.
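The link-following described above can be sketched as a breadth-first walk with a cap on how many pages get visited. This is only a rough illustration, not how Googlebot is actually implemented: the `crawl` function, the `SITE` link map, and the use of a page count as the "budget" (rather than crawl time) are all assumptions made for the example.

```python
from collections import deque


def crawl(start_url, links, budget):
    """Breadth-first crawl over an in-memory link graph.

    `links` maps each URL to the URLs it links to (a stand-in for
    fetching and parsing real pages). `budget` caps the number of
    pages visited, a simplification of a time-based crawl budget.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < budget:
        url = queue.popleft()
        visited.append(url)
        # Queue every newly discovered link exactly once.
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Hypothetical site structure, for illustration only.
SITE = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/a", "/products/b"],
    "/products/a": [],
    "/products/b": ["/about"],
}

if __name__ == "__main__":
    print(crawl("/", SITE, budget=3))
    print(crawl("/", SITE, budget=10))
```

With a small budget the crawl stops before reaching the deeper product pages, which is the practical reason pages buried many clicks from the entry point can get crawled less often.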
      Signature
      BizSellers.com - The #1 place to buy & sell websites!
      We help sellers get the MAXIMUM amount for their websites and all buyers know that these sites are 100% vetted.
  • Profile picture of the author crescendo
    Googlebot retrieves the content of web pages. If the content it retrieves has links to other things, that is noted. It then sends the information to Google. In order for your web pages to be found in Google, they must be visible to Googlebot. In order for your pages to rank optimally, all page resources must be accessible to Googlebot.
  • Profile picture of the author CreativeDreamrz
    Googlebot crawls every page of a website except those that are disallowed in the robots.txt file or blocked by nofollow tags. All pages are crawled according to the submitted sitemap.
  • Profile picture of the author umakant13
    The Googlebot crawler crawls all the content and the links placed on keywords.
  • The crawling process starts with a list of web page URLs that comes from previous crawls and from submitted sitemaps. Whenever Googlebot visits a site, it gathers links, which are then added to that list. That way the changes you make to your website are tracked, including dead links, so that the Google index can be updated.
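To make the "list of URLs from previous crawls plus sitemaps" idea concrete, here is a minimal sketch that merges the two sources into one starting list. The function names (`sitemap_urls`, `build_frontier`) and the example URLs are made up for illustration; the only real-world detail relied on is the standard sitemap XML namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap XML format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def build_frontier(previous_urls, xml_text):
    """Merge previously known URLs with sitemap URLs.

    Duplicates are dropped and first-seen order is preserved.
    """
    combined = list(previous_urls) + sitemap_urls(xml_text)
    return list(dict.fromkeys(combined))


# Hypothetical sitemap, for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog</loc></url>
</urlset>"""

if __name__ == "__main__":
    known = ["https://example.com/blog", "https://example.com/old"]
    print(build_frontier(known, EXAMPLE_SITEMAP))
```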
  • Profile picture of the author michaelkoehler92
    After reaching a page, Googlebot will first crawl the H1 tag.
    • Profile picture of the author yukon
      Banned
      Originally Posted by michaelkoehler92 View Post

      After reaching a page, Googlebot will first crawl the H1 tag.

      Ridiculous.

      Like every web page on the web has an <h1> loading first.
  • Profile picture of the author xx 8c
    That's great information! That is actually right: the Googlebot crawler searches your website for links, both active and inactive, and they are indexed by Google. This is why sitemaps are also needed or recommended for crawling and indexing purposes.
  • Profile picture of the author paulgl
    Actually, Google tries to crawl robots.txt first....

    In other words, you can create a link for google to find, then go to...but it won't crawl it if directed by the robots.txt.
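A quick way to see the robots.txt check in action is Python's standard `urllib.robotparser` module. This is just a sketch of the general idea, not Google's own logic; the `allowed` helper, the sample rules, and the example.com URLs are assumptions for the demo.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
    print(allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"))
```

A well-behaved crawler would run a check like this before requesting any discovered link, which is why a link alone is not enough to get a disallowed page crawled.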

    Once it passes that test, it will (maybe?) crawl the page, but only certain parts. It then "processes" bits of info that superior webmasters have done their due diligence on. It does not need, nor want, to do the whole friggin thing. That's why stupid stuff like keyword density is, well, stupid stuff.

    Then it decides whether it is worth indexing and where to place it.

    So if you have important stuff at the bottom of the page, good luck with that!

    Paul
    Signature

    If you were disappointed in your results today, lower your standards tomorrow.

  • Profile picture of the author pppanda
    Google uses software known as “web crawlers” to discover publicly available web pages. The most well-known crawler is called “Googlebot.” Crawlers look at web pages and follow links on those pages, much like you would if you were browsing content on the web. As they visit websites, they look for links to other pages to visit, paying special attention to new sites, changes to existing sites, and dead links. They go from link to link and bring data about those web pages back to Google’s servers.
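For anyone curious what "looking for links on a page" means in practice, here is a small sketch using Python's built-in `html.parser`. It only illustrates the general technique; the `LinkCollector` class, the `extract_links` helper, and the sample HTML are invented for this example and say nothing about Googlebot's real parser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page snippet, for illustration only.
PAGE = '<p>See <a href="/about">about</a> and <a href="https://other.example/x">x</a>.</p>'

if __name__ == "__main__":
    print(extract_links("https://example.com/blog/", PAGE))
```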