Google Analytics - Links to blocked PDFs killing my keywords

by themew
3 replies
Looking for suggestions to a problem that keeps coming back...

Google Webmaster Tools suddenly showed that my keywords had changed from my specific keywords to random words like 'choose' and 'push'. Upon further investigation, Webmaster Tools showed that Google was not only linking to my PDFs (we have instruction manuals on our site) but also reading them. Because the words that appear most often in the PDF files outnumber the keywords on our site, Google replaced my keywords with those words.

So I blocked the PDF directory in robots.txt, and within a week or so the keywords from the PDF files were gone and my keywords were back, and so was our rank on page 1 for most of our keywords.

Everything is fine until the problem comes back... It turns out, again according to Webmaster Tools, that if another site links to a page or file, Google still picks it up even when our robots.txt blocks that file and/or directory from being crawled. Links to the PDFs have caused Google to read them, and the keywords have switched back to the many useless words Google found in the PDF files.

Wow -- what a great way to destroy someone's keywords.

So, how do I fix this??? I can't block the URL the way Google advises, because that requires the URL to return a 404 error, which means deleting the PDF, and I can't do that. I can change the filenames of the PDFs and enter a 'delete url' request on Google, but the linking sites will eventually find the new PDF files and relink to them.

Has anyone experienced something like this before? Any ideas how to fix this problem?
  • yukon
    Banned
    You can do two things to stop Google from reading those pdf files.

    1) Block the PDF files with your site's .htaccess file (post #14 & #15)

    2) Or, save the PDF files inside a .zip file, then upload it to your site (and remove the old PDF files).

    Google search can't open/read the contents of a .zip file.
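
    For the .htaccess route, one minimal sketch (not the code from the posts referenced in option 1) is to send an X-Robots-Tag header with every PDF. It assumes Apache with mod_headers enabled, and the file pattern is only illustrative. Google documents this header as a way to keep non-HTML files such as PDFs out of its index:

    ```apache
    # Assumes Apache with mod_headers; adjust the pattern to match your files.
    <IfModule mod_headers.c>
      <FilesMatch "\.pdf$">
        # Ask search engines not to index PDFs or follow links inside them.
        Header set X-Robots-Tag "noindex, nofollow"
      </FilesMatch>
    </IfModule>
    ```

    One caveat: a crawler only sees this header if it is allowed to fetch the file, so a robots.txt Disallow covering the PDF directory would likely need to be removed for the noindex to take effect.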
  • themew
    Yukon -- thanks for the reply.

    It seems (just from reading up on cPanel hotlink protection) that blocking leeching or hotlinking of the PDFs will have no effect on Google's crawlers: even if the offending sites can't link directly to the PDF, the link will still remain on their site.
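
    For illustration, hotlink protection of this kind typically boils down to referer-based rewrite rules like the sketch below. The domain example.com is a placeholder, and the exact rules cPanel generates may differ. Requests with an empty Referer are let through, and crawlers generally don't send one, which is why this is unlikely to keep Google away from the PDFs:

    ```apache
    # Assumes mod_rewrite; example.com stands in for our own domain.
    RewriteEngine On
    # Let through requests that carry no Referer header (direct visits, most crawlers).
    RewriteCond %{HTTP_REFERER} !^$
    # Let through requests referred from our own site.
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
    # Any other request for a PDF gets 403 Forbidden.
    RewriteRule \.pdf$ - [F,NC]
    ```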

    The only other way may be to change the PDF to a zip file, but then customers can't read the PDF online, as they can with Chrome; they would have to download it, unzip it, and then read it. That will stop Google from reading the PDF but could upset our customers.

    I can't believe that a link overrides a 'disallow' in a robots.txt file.

    Any idea on the consequences of telling Webmaster Tools to remove the link from being crawled but leave the link live??
  • yukon
    Banned
    I've blocked Google from adding my images (specific image formats) with the .htaccess code in my last comment.

    It's easy to do from the HostGator cPanel under Hotlink Protection.

    I've never tried to block a PDF, but I don't see why it wouldn't work.
