What's the best way to prevent Google indexing a bunch of duplicate pages?

by Chris-
17 replies
I have a new site where I will be making many duplicates of a particular page, each with a different code on it, in order to implement an easy manual affiliate system (for various reasons I am choosing not to use a standard affiliate system at present). Obviously, I don't want Google to index or look at a whole bunch of nearly identical pages.

What's the best way to prevent Google looking at, or indexing, those pages?

thanks in advance for any answers on this


Chris
  • Profile picture of the author JOSourcing
    Google explains how to handle duplicate content on this page: Duplicate content - Webmaster Tools Help
  • Profile picture of the author Chris-
    thanks

    Chris
    • Profile picture of the author JOSourcing
      Originally Posted by Chris- View Post

      thanks

      Chris
      You're welcome!
  • Profile picture of the author Chris-
    I'll use
    <meta name="robots" content="noindex">

    on the pages I don't want indexed, thanks
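
    One way to confirm the tag actually made it onto each page is to parse the HTML and look for it. The following is only a rough sketch in Python using the standard library; `RobotsMetaFinder` and `has_noindex` are names made up for illustration, and the check covers the tag's presence only, not how any search engine treats it.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None when written without a value.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

    Running `has_noindex` over every generated file before upload would catch a page where the tag was left out by mistake.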

    Chris
    • Profile picture of the author Claire Koch
      I was going to suggest that, but I wanted to tell you that's not 100% infallible. Remember, they are still looking at it.

      Originally Posted by Chris- View Post

      I'll use
      <meta name="robots" content="noindex">

      on the pages I don't want indexed, thanks

      Chris
      • Profile picture of the author Chris-
        Originally Posted by Claire Koch View Post

        I was going to suggest that, but I wanted to tell you that's not 100% infallible. Remember, they are still looking at it.
        Yes, I understand what you're saying. Thanks!


        Chris
  • Profile picture of the author ralphnsk
    use this on your duplicate pages

    <link rel="canonical" href="http://your-main-page" />
  • Profile picture of the author Chris-
    Can anyone say which of the two methods (canonical or noindex) is better, and exactly why?

    Chris
  • Profile picture of the author agc
    Why not just hide them in your robots.txt?
    • Profile picture of the author TerryL
      For WordPress sites, I use the Robots Meta plugin to tell the search engines which pages I want indexed and which ones I don't. It's a free plugin and makes things very simple.
  • Profile picture of the author John Romaine
    Chris, on pages that are very similar, I would recommend implementing the canonical tag.

    For example, let's say you have the following two pages, which are basically identical except that the trailing query string values differ.

    www.yoursite.com/product.php?ID=345
    www.yoursite.com/product.php?ID=290

    It would be best to simply use the canonical tag and set it to...


    Code:
     
    <link rel="canonical" href="http://www.yoursite.com/product.php">
    As a precaution, I always make sure that the page specified in the canonical tag loads without any issues, because problems can arise if Google indexes a page that fails to load. (This might be the case if the query string is essential for loading the page's data via an embedded SQL query or stored procedure.)
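
    To illustrate the idea, here is a minimal Python sketch that builds the canonical tag by dropping the query string from a URL. It assumes the query string does not change what the page shows (as with a tracking or affiliate code); `canonical_link_tag` is a hypothetical helper, not part of any library.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Build a canonical <link> tag that points at the URL without its query string."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop the query string and fragment.
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


print(canonical_link_tag("http://www.yoursite.com/product.php?ID=345"))
print(canonical_link_tag("http://www.yoursite.com/product.php?ID=290"))
```

    Both calls print the same tag, pointing at http://www.yoursite.com/product.php. The tag belongs in the head of every variant page, and the href is written as an absolute URL including the scheme.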

    As for restricting access to certain pages, I would do this via your robots.txt file. You'll find it's much easier than updating multiple pages across your site, because you only need to change the one file.

    Hope I haven't missed anything here.

    Best of luck
    Signature

    BS free SEO services, training and advice - SEO Point

  • Profile picture of the author Chris-
    So, between

    <link rel="canonical" href="http://your-main-page" />

    and

    <link rel="canonical" href="http://www.yoursite.com/product.php">

    what's the difference, which is better and why?

    And why is that better (or not) than hiding them in robots.txt?

    thanks


    Chris
    • Profile picture of the author John Romaine
      Originally Posted by Chris- View Post

      So, between

      <link rel="canonical" href="http://your-main-page" />

      and

      <link rel="canonical" href="http://www.yoursite.com/product.php">

      what's the difference, which is better and why?
      Chris, you set canonical tags on individual pages. It's not a case of "which one's better?" They're assigned to specific pages on your site to tell the search engines what your preferred URL is (and which URL you want indexed).

      You really should do some reading on this subject.

      Here are a few resources that might help you.

      Official Google Webmaster Central Blog: Specify your canonical

      About rel="canonical" - Webmaster Tools Help

      Originally Posted by Chris- View Post

      And why is that better (or not) than hiding them in robots.txt?
      I think you're confusing the purpose behind the two. Your robots.txt file allows you to specify exclusions. Your canonical tag allows you to specify (as said above) your preferred URL.

      Here's some information on the robots.txt file.

      The Web Robots Pages

      My preference is to utilise the robots.txt file to set exclusions, as (to me) it only makes sense to do this within one file (your robots.txt file) as opposed to individual pages - which of course might be tedious if you have a relatively large site.
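
      As a rough illustration, assuming the duplicate pages all live under a hypothetical /aff/ directory, a two-line rule covers them, and Python's standard `urllib.robotparser` can confirm what the rule blocks. Bear in mind that robots.txt controls crawling: a crawler that is blocked from a page cannot read a noindex or canonical tag placed on that page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /aff/ directory is an assumption for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /aff/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable(ROBOTS_TXT, "http://www.yoursite.com/aff/abc123.html"))
print(is_crawlable(ROBOTS_TXT, "http://www.yoursite.com/product.php"))
```

      The first call reports the affiliate copy as blocked and the second reports the main page as crawlable, which is the whole effect of the single Disallow line.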
  • Profile picture of the author Alston
    If you can create one page and use code to modify it based on the query string, that would be the best way to go. Then you can just tell Google and Bing to ignore the parameter.

    It sounds like all the different pages are the same except for a few details. You can have someone write code that will display the unique graphics and/or text that goes with the specific query string.

    I do this with my PPC adgroups. If you click on one of my ads that promotes auto insurance in Michigan, the page will say "Michigan Auto Insurance Quotes." Instead of having 51 different pages, I just pull the state from the query string.

    If you need to include more data than just the state name, you would put the row's primary key, or some other unique value from the table such as an affiliate code, in the query string. The code would grab the data from that row of your table and add it to the page.

    This might be the best long-term solution, but I program my own sites. It may be easier to use one of the above suggestions rather than deal with a programmer.

    That being said, it doesn't sound like something a programmer would charge an arm and a leg for.
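
    A bare-bones version of the single-page idea might look like this in Python. The `aff` parameter name, the `AFFILIATES` table, and `render_page` are all made up for the example; a real site would pull the row from its own database and render a full page.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup table standing in for a database table keyed by affiliate code.
AFFILIATES = {"abc123": "Alice", "xyz789": "Bob"}


def render_page(url: str) -> str:
    """Render the page heading, customised by the `aff` query string value."""
    query = parse_qs(urlsplit(url).query)
    code = query.get("aff", [""])[0]
    name = AFFILIATES.get(code)
    # Unknown or missing codes fall back to the generic heading.
    greeting = f"Referred by {escape(name)}" if name else "Welcome"
    return f"<h1>{greeting}</h1>"


print(render_page("http://www.yoursite.com/offer.php?aff=abc123"))
print(render_page("http://www.yoursite.com/offer.php"))
```

    With this approach there is only one real page, so the duplicate question reduces to handling one query string parameter.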
  • Profile picture of the author Chris-
    Thanks John, I've had a quick read of some of the links already. Thanks for your comments.

    Thanks Alston. To start with, I'll do things an easier way, which is just writing my own scripts to automatically generate all the similar pages in a few seconds. If the site becomes profitable, I'll look into better ways of doing everything later; I just want to get it going first.
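
    A generator script along these lines could be sketched in Python as follows: one template, one output page per affiliate code, each carrying the noindex tag. The file naming, the `aff` parameter, and the order.php link are placeholders, not details from this thread.

```python
from string import Template

# Hypothetical page template; $code is replaced with each affiliate code.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta name="robots" content="noindex">
<title>Special offer</title>
</head>
<body>
<a href="http://www.yoursite.com/order.php?aff=$code">Order now</a>
</body>
</html>
""")


def build_pages(codes):
    """Map an output filename to the finished HTML for each affiliate code."""
    return {f"offer-{code}.html": PAGE.substitute(code=code) for code in codes}


pages = build_pages(["abc123", "xyz789"])
for filename in sorted(pages):
    print(filename)
```

    Writing each entry of `pages` to disk is then a short loop, and keeping the template in one place means a later switch from noindex to a canonical tag is a one-line change.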


    Chris
  • Profile picture of the author John Romaine
    Chris, just some quick advice mate.

    If your site is quite large, you'll want to make sure that you get it right the first time round.

    It can be a real mess if you screw this up and you're left to try and fix it later. Especially once Google starts indexing pages.
  • Profile picture of the author Chris-
    OK, I understand the point

    thanks!

    Chris