How do I unindex a PDF file on my server?

by ezjob
10 replies
Hello,

I uploaded a pdf file to my website.

website.com/name-of-file.pdf

Google has indexed it.

I didn't know that search engines indexed uploaded files.

How do I unindex the file and stop this from happening again?

I have a WordPress blog and I am able to prevent certain pages from being indexed, but this has thrown me for a loop.

I assumed that uploads to the website would not be indexed.

Thanks for your help,
Ja
#file #pdf #server #unindex
  • KirkMcD
    Originally posted by ezjob: "I assumed that uploads to the website would not be indexed."

    If there is a link to it from anywhere, expect it to get indexed.

    Remove information from Google: Remove a page or site from Google's search results

    This will ONLY remove it from Google, not from any other search engine.
    • ezjob
      Thanks for the help, but is there something I can change on my server or blog to unindex this?

      I've already changed the name of the file so it isn't accessible from the search results.

      I would like to know more about this.

      Does a robots.txt file work on a WordPress blog hosted on my own server?
      • Karen Blundell
        Here's what I would do if I were you. Create a directory to put your .pdf downloads in.

        Put that .pdf in the new directory.

        Now, if you don't have one already, create a robots.txt file. This file must be uploaded to your root directory, usually "public_html".

        What you would add to the file is something like this:

        User-agent: *
        Disallow: /directory-name/


        This tells all compliant search engines and bots not to crawl anything within that directory.

        For more information about how to create a robots.txt file, visit:
        The Web Robots Pages

        Update: yes, robots.txt files work with WordPress sites. I have one.
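One way to sanity-check a rule like the one above before uploading it is Python's standard-library urllib.robotparser. This is only a sketch: the helper name is_allowed and the example URLs are made up for illustration, and it shows how a rule-following crawler would read the file, not what any particular search engine will actually do.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory-name/
"""


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A PDF inside the disallowed directory versus one left in the site root.
in_directory = is_allowed(ROBOTS_TXT, "https://website.com/directory-name/name-of-file.pdf")
in_root = is_allowed(ROBOTS_TXT, "https://website.com/name-of-file.pdf")
print(in_directory, in_root)
```

A file left in the site root is still fetchable under these rules, which is why the PDF has to be moved into the disallowed directory for the rule to cover it.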
        • ezjob
          I guess that by "create a directory" you mean create a folder and name it anything.

          Then disallow the name of the folder that contains the pdf.

          I guess each pdf in the folder would then be accessed at:

          folder-name/name-of-pdf.pdf

          Is this correct?

          Thanks,
          Ja
          • Joseph Wilcox
            Yes. "Folder" and "directory" are interchangeable here.

            You would then disallow it in the robots.txt file.

            Mind you, not all search engines obey the robots.txt file.

            Are you linking to the .PDF file from a page on your site? Is the directory/folder that you're keeping the .PDF files in publicly viewable?

            JW
            • ezjob
              I'm not linking to the file but it is accessible with the correct link.

              It is in my public_html folder.

              Publicly viewable? Do you mean uploaded to a web page?
              • newbie365
                I would put the file outside public_html, where it has no public URL, and then use PHP headers to serve it for download. Doing it this way makes the file not directly accessible to the public but still allows you to serve it. You can also make the download filename dynamic, so that each download gets a different filename.pdf,

                for example time().'.pdf' when output to the browser.

                hope this helps
                • ezjob
                  I don't understand. Please show an example of how to create a PHP header for a PDF.
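The thread itself does not include the requested example. As a rough illustration of the idea only, here is a sketch in Python rather than PHP: the header names are the same ones a PHP script would send with header(). The directory path and the function names are hypothetical, and the X-Robots-Tag line is an extra that was not mentioned in the thread; it asks crawlers that honor it not to index the response.

```python
import time
from pathlib import Path

# Hypothetical folder outside the web root, so no URL maps to it directly.
PRIVATE_DIR = Path("/home/example/private_files")


def download_headers(size_bytes, now=None):
    """Build response headers that make the browser download a PDF.

    The filename shown to the visitor is generated from the current Unix
    time, mirroring the time().'.pdf' idea from the post above.
    """
    stamp = int(time.time() if now is None else now)
    return [
        ("Content-Type", "application/pdf"),
        ("Content-Disposition", f'attachment; filename="{stamp}.pdf"'),
        ("Content-Length", str(size_bytes)),
        # Assumption, not from the thread: ask crawlers not to index this.
        ("X-Robots-Tag", "noindex"),
    ]


def read_private_pdf(name, private_dir=PRIVATE_DIR):
    """Return the bytes of a PDF stored in the private folder."""
    # Keep only the last path component so "../" cannot escape the folder.
    safe_name = Path(name).name
    return (Path(private_dir) / safe_name).read_bytes()
```

A download script would call read_private_pdf() for the requested file, send the headers from download_headers(len(data)), and then write the bytes as the response body. Because the PDF has no URL of its own, only the script's URL is ever exposed.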
  • hhunt
    One thing you can do is remove it from your server, or use robots.txt to disallow the files you don't want indexed.

    A quick search for robots.txt should show you how to do it.

    Good luck!
  • clonepal
    Create a directory, put your file there, and protect it with .htaccess so search engines will no longer be able to index it.