How to protect files and a thank-you page from Google & hackers?

17 replies
hey guys!
I'm looking for the best way to protect some files on my site from Google's spiders and from hackers looking to get free stuff without buying it.
I found that robots.txt is the most commonly used tool, but even that file can't protect everything (I think). Are there better solutions out there?
#files #google #hacker #page #protect
  • Profile picture of the author Derek S
    I just did this today after noticing in Google Analytics that people were accessing my download page without buying...

    If you are using PayPal, make sure your HTML buy button code is encrypted.

    If my money site is golf-swing.com, I'll host the download page on a different domain such as golf-swing.net, or something random such as your-name.com.

    This will eliminate almost all theft. Just make sure you're not hosting all of your product download pages on one domain, though, for obvious reasons lol
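    For illustration, here's roughly what an encrypted PayPal Buy Now button looks like (you generate the real thing inside your PayPal account - the PKCS7 blob below is only a placeholder). Because the item details and price live inside the encrypted blob, visitors can't read or tamper with them:

    <!-- Encrypted Buy Now button (sketch - generate yours in your PayPal account).
         cmd=_s-xclick tells PayPal to use the hosted/encrypted button data. -->
    <form action="https://www.paypal.com/cgi-bin/webscr" method="post">
      <input type="hidden" name="cmd" value="_s-xclick">
      <input type="hidden" name="encrypted"
             value="-----BEGIN PKCS7-----MIIH...placeholder...-----END PKCS7-----">
      <input type="submit" value="Buy Now">
    </form>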
    Signature

    --- Work Smart... Not Hard ---

    • Profile picture of the author spearce000
      Originally Posted by Derek S View Post

      I just did this today after noticing in Google Analytics that people were accessing my download page without buying...

      If you are using PayPal, make sure your HTML buy button code is encrypted.

      If my money site is golf-swing.com, I'll host the download page on a different domain such as golf-swing.net, or something random such as your-name.com.

      This will eliminate almost all theft. Just make sure you're not hosting all of your product download pages on one domain, though, for obvious reasons lol
      Good advice. I'd go one stage further and put the download page in a subdirectory on the second site rather than in the root directory. You should also make sure the robots.txt file on the second domain tells robots not to crawl that directory. I found a site that will generate a robots.txt file for you here: Robots.txt Generator.
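      For example, a minimal robots.txt on the download domain might look like this (/members-dl/ is just a placeholder for whatever you name the directory):

      User-agent: *
      Disallow: /members-dl/

      Bear in mind robots.txt is purely advisory: well-behaved crawlers honour it, but a bad bot can ignore it completely, or even use it as a signpost to the very directory you're hiding.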

      You should also add the following meta tags between the <head> and </head> tags of your download page:

      <meta name="robots" content="nofollow,noarchive,noindex">
      <meta name="revisit" content="never">

      That should stop your page being indexed by search engines, and download links to your product being followed.
  • Profile picture of the author tpw
    On my thank-you page, in the PHP code, I check whether the visitor landing on the page is a bot:

    if ($visitor_is_bot) $page_content = ""; // bots get a blank page
    else $page_content = $real_page_html; // humans get the real content
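    Fleshed out, a minimal runnable sketch of the same idea (the user-agent pattern below is illustrative only, and it only catches bots that identify themselves honestly):

    <?php
    // Serve bots a blank page; serve humans the real thank-you content.
    // Caveat: the User-Agent header is trivially spoofed, so this only
    // catches well-behaved crawlers that announce themselves.
    function looks_like_bot(string $ua): bool {
        return (bool) preg_match('/bot|crawl|spider|slurp/i', $ua);
    }

    if (looks_like_bot($_SERVER['HTTP_USER_AGENT'] ?? '')) {
        exit; // bots get a blank page
    }
    echo 'Thanks for your purchase - your download links are below.';
    ?>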

    As the other poster said, also encrypt your payment button.

    The thing is, you can easily identify the spiders and feed them a blank page...

    But stopping a determined hacker is often very difficult, and beyond the scope of a simple answer.
    Signature
    Bill Platt, Oklahoma USA, PlattPublishing.com
    Publish Coloring Books for Profit (WSOTD 7-30-2015)
    • Profile picture of the author xiaophil
      ...you can easily identify the spiders...
      How are you identifying the spiders?

      Many of the bad bots I have seen spoof the user agent to look like a human.

      I don't see the point of serving a blank page to a 'good' bot as it will obey robots.txt and keep out of your download area anyway (assuming you have disallowed that directory). You should never get so far as having to serve a page to them. The flipside is that the same entry in robots.txt is effectively an advertisement for the bad bots and scrapers - and in my experience these are far from easy to detect.

      After manually trawling logs for some months I am convinced that a fair proportion of 'visitors' are actually various kinds of bots attempting to reverse engineer the sites - especially the ones that rank.

      The more advanced bots can also accept cookies and interpret JavaScript which makes managing them or even distinguishing them from real people quite a challenge.

      There are ways, but IMO it's far from easy, unless perhaps I am missing a trick (am I?).

      Next time you tweet a URL take a look at your raw logs - most of that is bots.

      While 100% detection rate is most likely impossible, at some point I am going to flick the switch and tell them all to fsck off. Sure there will be some false positives but if someone is blanking their referrer, spoofing their user agent and going through a proxy while rejecting cookies and not running script then perhaps they don't fit my customer profile anyway.

      For anyone using G.Analytics or other JS-based tracking, don't forget that most of the nasties (that I have seen anyway) are still unable or unwilling to execute JavaScript and therefore don't leave a trace, while your site could be literally crawling with scrapers and bots. If you rank for anything, it's even more likely, thanks to the plethora of SEO reverse-engineering software out there (I won't name any names).
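      One way to see this for yourself: append every request to a server-side log and compare it with what your JS-based tracking reports. Hits that appear in the raw log but never in Analytics are prime bot candidates. A minimal sketch (hits.log is just an illustrative filename):

      <?php
      // Log one line per request, server-side. Requests that show up here
      // but never in JavaScript-based analytics probably never ran any JS,
      // i.e. they are likely bots.
      $line = sprintf("%s\t%s\t%s\t%s\n",
          date('c'),
          $_SERVER['REMOTE_ADDR'] ?? '-',
          $_SERVER['HTTP_USER_AGENT'] ?? '-',
          $_SERVER['REQUEST_URI'] ?? '-');
      file_put_contents(__DIR__ . '/hits.log', $line, FILE_APPEND | LOCK_EX);
      ?>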

      A bot is never going to join your list, buy from you or add a meaningful comment to your blog. They just burn your bandwidth, rip your content or try to reverse engineer your hard work. Aside from the legit search engine crawlers, AFAIAC they can all p*ss off.

      Now the short answer - for downloads try DLGuard - it's very nice.

      Cheers,
      Phil
      • Profile picture of the author tpw
        Originally Posted by xiaophil View Post

        How are you identifying the spiders?

        Many of the bad bots I have seen spoof the user agent to look like a human.
        I was actually only referencing the good bots (like Googlebot) when I said that. They can be identified through the HTTP User-Agent header...
        Signature
        Bill Platt, Oklahoma USA, PlattPublishing.com
        Publish Coloring Books for Profit (WSOTD 7-30-2015)
        • Profile picture of the author xiaophil
          Originally Posted by tpw View Post

          I was actually only referencing the good bots (like Googlebot) when I said that. They can be identified through the HTTP User-Agent header...
          With respect, you can't identify a good bot purely from the HTTP User-Agent - the malicious ones will spoof the UA and tell you they're Googlebot or a friendly browser.

          The only way I currently know to validate a 'good' crawler is a forward-confirmed reverse-DNS lookup to a known-good hostname. I would be interested in hearing about any others.
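          A minimal sketch of that check (Google documents this method for verifying Googlebot):

          <?php
          // Reverse-resolve the IP, check the hostname really belongs to
          // Google, then forward-resolve it and confirm it maps back.
          function is_real_googlebot(string $ip): bool {
              $host = gethostbyaddr($ip);
              if ($host === false || $host === $ip) {
                  return false; // no usable PTR record
              }
              if (!preg_match('/\.(googlebot|google)\.com$/', $host)) {
                  return false; // hostname isn't Google's
              }
              return gethostbyname($host) === $ip; // forward lookup must match
          }
          ?>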

          Anyway if it's a good bot then why do you need to serve them a blank page? They will obey robots.txt. Or you could make the link nofollow?

          Cheers,
          Phil
          • Profile picture of the author tpw
            Originally Posted by xiaophil View Post

            Anyway if it's a good bot then why do you need to serve them a blank page? They will obey robots.txt. Or you could make the link nofollow?
            The OP asked about blocking Google...

            I know of one fellow here on the WF who insists that his forum's robots.txt is set up to tell the good bots not to crawl it...

            Yet, you can find the posts in his private forum showing up in Google's search results...

            I read an article from Google suggesting that while they do honor nofollows, they also believe it is important to at least look at the content of the page behind the nofollow...

            That is the only reason I can think of for why Google would have crawled a private forum whose owner told Google not to crawl it in his robots.txt.

            Serving a blank page to Google is a technique to ensure that the content you want to hide from Google stays truly hidden...


            P.S. I don't understand why Google has actually archived this fellow's private forum... They should have stopped at the robots.txt -- but you can see his forum pages in Google's search results, and if you click the cached link, you can see what was discussed on that page...

            Maybe the forum owner did his robots.txt wrong... I don't know...
            Signature
            Bill Platt, Oklahoma USA, PlattPublishing.com
            Publish Coloring Books for Profit (WSOTD 7-30-2015)
  • Profile picture of the author brendan301
    I'm looking for a solution to that myself. DLGuard gets a lot of good reviews here. I believe it's $97; however, there are other download page protectors out there that are cheaper.
  • Profile picture of the author quickcashstrategy
    I've got one, but I don't know how to use it the way I want, and I don't know why they didn't provide any training videos.
    • Profile picture of the author bretski
      ejunkie is pretty good - cheap, and the solution I would recommend. As for blocking with robots.txt: not a good idea. I believe ejunkie also puts the customer's name on the file, so you can track it if someone starts sharing your stuff around.
      Signature
      ***Affordable Quality Content Written For You!***
      Experience Content Writer - PM Bretski!
  • Profile picture of the author tehnolife
    There are so many sites that do this for you... like ejunkie.com, clickbank.com, payloadz.com, etc. This will stop a lot of thieves!

    Stefan
    • Profile picture of the author bretski
      Originally Posted by tehnolife View Post

      There are so many sites that do this for you... like ejunkie.com, clickbank.com, payloadz.com, etc. This will stop a lot of thieves!

      Stefan
      Sorry...clickbank doesn't do squat for protecting your thank you or download page
      Signature
      ***Affordable Quality Content Written For You!***
      Experience Content Writer - PM Bretski!
      • Profile picture of the author tehnolife
        ClickBank makes sure that your product gets delivered properly... so it is protecting your download page!!

        Originally Posted by bretski View Post

        Sorry...clickbank doesn't do squat for protecting your thank you or download page
        • Profile picture of the author Mohammad Afaq
          Originally Posted by tehnolife View Post

          ClickBank makes sure that your product gets delivered properly... so it is protecting your download page!!
          Nope, all ClickBank does is put a big button on the page after a person buys, linking to the download page. That way the buyer gets the download for sure and isn't confused, but as far as protecting the download page goes, ClickBank doesn't do anything.
          Signature

          “The first draft of anything is shit.” ~Ernest Hemingway

  • Profile picture of the author vishalduggal
    You can use a third-party payment processor like ejunkie to protect your downloads.
  • Profile picture of the author Shaun OReilly
    Originally Posted by quickcashstrategy View Post

    hey guys!
    I'm looking for the best way to protect some files on my site from Google's spiders and from hackers looking to get free stuff without buying it.
    I found that robots.txt is the most commonly used tool, but even that file can't protect everything (I think). Are there better solutions out there?
    The best way I've found to protect my files is to use DLGuard:

    DLGuard - Download page protector, create expiring download links (no affil.)

    It's mega-secure. Why?

    I place the downloads in a non-web-accessible folder that simply can't be indexed or crawled.

    And no one can access the download until they've completed their payment via PayPal. Only DLGuard can give them access to the download area after that.
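    For anyone curious about the pattern behind that, here's a sketch of the general idea (not DLGuard's actual code - token_is_valid() is a hypothetical stand-in for however your payment handler records completed purchases):

    <?php
    // deliver.php - the "non-web-accessible folder" pattern, sketched.
    // The file lives outside the webroot, so no URL maps to it directly;
    // this script is the only way to reach it.
    // token_is_valid() is hypothetical - replace with your own lookup.
    if (!token_is_valid($_GET['token'] ?? '')) {
        http_response_code(403);
        exit('No access.');
    }
    $file = '/home/user/private_downloads/product.pdf'; // outside the webroot
    header('Content-Type: application/pdf');
    header('Content-Disposition: attachment; filename="product.pdf"');
    readfile($file);
    ?>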

    Dedicated to your success,

    Shaun
    • Profile picture of the author Jeff Bush
      Today I noticed that Google has placed my entire book in preview, when it should only be viewable if purchased.
      I wouldn't have found this except that I noticed my sidebar, with its opt-in autoresponder form, was missing along with the advertising. I went to Google search intending to click the "cached" link, as that should fix the sidebar problem, but instead discovered the book was there, free to download. I went back to the admin area of the site and checked the thank-you page for the download, and it is set up the way it's supposed to be. DLGuard is another purchase that is weeding out money-starved weaklings like me. Is there a free alternative plugin anyone would recommend?
