06-24-2009, 05:25 AM   #1
howdo-i
PDF copies of HTML pages - is it a bad idea?

I want to offer my visitors the choice of viewing my content in HTML or PDF format, but if I do this I will be creating duplicate content in two different formats. Does anyone know how the search engines view this? Will they index both versions, one version, or neither?

Thanks for any insight you might have.
Steve

06-24-2009, 06:30 AM   #2
Colin Evans
Re: PDF copies of HTML pages - is it a bad idea?

Google will index both versions, but if you save all your PDF files in one folder you can use a robots.txt file to tell the search engines not to crawl anything in that folder.

Just add this to your robots.txt file:

User-agent: *
Disallow: /your-pdf-folder/
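
As a rough way to check that rule before uploading it, here is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` URLs and the `is_crawlable` helper are made up for illustration, and `/your-pdf-folder/` is just the placeholder from the rule above. Note that it only shows what a well-behaved crawler is allowed to fetch; it says nothing about how a particular search engine ranks or lists pages.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the post, with the placeholder folder name left as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /your-pdf-folder/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one PDF inside the blocked folder, one normal HTML page.
    for url in (
        "https://example.com/your-pdf-folder/guide.pdf",
        "https://example.com/guide.html",
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Run against your real robots.txt text, this should report the PDF folder as blocked and the HTML pages as still crawlable.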

Sig not working today - too hung over...
06-24-2009, 06:58 AM   #3
howdo-i
Re: PDF copies of HTML pages - is it a bad idea?

Quote:
Originally Posted by Colin Evans
Google will index both versions, but if you save all your PDF files in one folder you can use a robots.txt file to tell the search engines not to crawl anything in that folder.

Just add this to your robots.txt file:

User-agent: *
Disallow: /your-pdf-folder/

Thanks for the robots.txt option, but what if I don't do that and Google indexes both versions? Will Google disregard one version for the search listings or, worse, will they penalise me and not list either version?

Best case would be to have both versions show up in SE results.

Steve

06-24-2009, 07:14 AM   #4
Colin Evans
Re: PDF copies of HTML pages - is it a bad idea?

Hi Steve,

Google will not penalise you - if you want to read how Google treats duplicate content, I wrote about it here: Duplicate Content Penalty vs. Duplicate Content Filters – The Truth Revealed

It's quite possible both versions will be displayed in the search results. I've never had a PDF outrank a post, but I suppose it's possible. In the end it depends on which gets the most incoming links...

Sig not working today - too hung over...
06-24-2009, 07:39 AM   #5
kindsvater
Re: PDF copies of HTML pages - is it a bad idea?

In my experience Google will not penalize you. However, I've seen Google give a better ranking to a PDF copy (which definitely had fewer incoming links than the original HTML page), so it is something I've tried to avoid.

06-24-2009, 07:46 AM   #6
dspa72
Re: PDF copies of HTML pages - is it a bad idea?

Google will not penalize you, but I suggest you avoid this duplicate content on your site. You could put the PDF in a zip file, for example.
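
If you go the zip route, a minimal sketch like the one below could bundle the PDFs with Python's standard `zipfile` module. The `zip_pdfs` helper and the folder and archive names are hypothetical, and this only covers packaging the files, not how any search engine treats the resulting archive.

```python
import zipfile
from pathlib import Path


def zip_pdfs(pdf_dir: str, archive_path: str) -> list:
    """Bundle every .pdf directly inside pdf_dir into one zip archive.

    Returns the file names stored in the archive, in sorted order.
    """
    pdf_files = sorted(Path(pdf_dir).glob("*.pdf"))
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for pdf in pdf_files:
            # Store only the file name so the archive has no folder structure.
            archive.write(pdf, arcname=pdf.name)
    return [pdf.name for pdf in pdf_files]
```

For example, `zip_pdfs("your-pdf-folder", "articles.zip")` would collect the PDFs from that (placeholder) folder into a single download. Writing the archive outside the PDF folder keeps it from cluttering the source directory.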

06-24-2009, 07:49 AM   #7
howdo-i
Re: PDF copies of HTML pages - is it a bad idea?

Quote:
Originally Posted by Colin Evans
Hi Steve,

Google will not penalise you - if you want to read how Google treats duplicate content, I wrote about it here: Duplicate Content Penalty vs. Duplicate Content Filters – The Truth Revealed

It's quite possible both versions will be displayed in the search results. I've never had a PDF outrank a post, but I suppose it's possible. In the end it depends on which gets the most incoming links...

Thanks Colin, that's a very useful post. It seems that all I have to do to work out what Google will do is apply common sense. That isn't always obvious lol.

Steve

06-24-2009, 07:52 AM   #8
howdo-i
Re: PDF copies of HTML pages - is it a bad idea?

Quote:
Originally Posted by kindsvater
In my experience Google will not penalize you. However, I've seen Google give a better ranking to a PDF copy (which definitely had fewer incoming links than the original HTML page), so it is something I've tried to avoid.

I would be very pleased for the PDFs to rank higher than my pages, because Google isn't exactly responding to my efforts yet lol.

Steve

06-24-2009, 07:54 AM   #9
howdo-i
Re: PDF copies of HTML pages - is it a bad idea?

Quote:
Originally Posted by dspa72
Google will not penalize you, but I suggest you avoid this duplicate content on your site. You could put the PDF in a zip file, for example.

Does Google not read content in zip files? It reads and indexes PDF files, and operating systems have been treating zip files as directories for years. I assumed that Google would look inside zip files too.

Steve

06-24-2009, 09:14 AM   #10
dspa72
Re: PDF copies of HTML pages - is it a bad idea?

In my experience, I've never seen a search result that came from inside a zip file. In this case, Google just indexes the file name, which can be found using the filetype:zip search operator.


Quote:
Originally Posted by howdo-i
Does Google not read content in zip files? It reads and indexes pdf files and operating systems have been viewing zip files as directories for years. I assumed that G would look inside zip files too.

Steve

06-24-2009, 09:55 AM   #11
howdo-i
Re: PDF copies of HTML pages - is it a bad idea?

Quote:
Originally Posted by dspa72
In my experience, I've never seen a search result that came from inside a zip file. In this case, Google just indexes the file name, which can be found using the filetype:zip search operator.

Hmmm, come to think of it, neither have I. Thanks for pointing this out.

Steve

All times are GMT -6.