• How do I prevent Google from listing a PDF file on my site in its search results? I sell an eBook in PDF format. When you search for my name and the topic of the book, a link appears that automatically downloads my eBook. Is it too late to remove the book?

    The URL at the bottom says mysitename.com/wp…Ebookname.pdf

    Is it too late, or is there something I can do? Where does my eBook PDF file reside if I delete it from my site?

Viewing 5 replies - 1 through 5 (of 5 total)
  • I have had great success using a robots.txt file to prevent Google from including certain files and folders. Details are here:
    https://en.wikipedia.org/wiki/Robots.txt

    I’m assuming that Google will drop files it has already indexed once you list them in robots.txt.

    I’d also suggest putting in place something like Download Guard, a download protection script.

    Thread Starter artsuppliesreview

    (@artsuppliesreview)

    Thanks for the robots.txt link on Wikipedia. It’s a little confusing. I’m using Thesis. How would I customize this code for my URL, and where would I put it?
    User-agent: *
    Disallow: /directory/file.html

    As for Download Guard, I understand that it protects the link, but can it prevent Google from “taking” the PDF file?

    Some days, I wish I was a programmer:)

    Using a text editor like Notepad, create robots.txt, then store it in the site root so that it can be found at mysitename.com/robots.txt

    I don’t know what Thesis is, but most people use an FTP program (or File Manager in cPanel on your web hosting account) to store files like robots.txt.

    In your robots.txt file, you would code:
    User-agent: *
    Disallow: /directory/book1.pdf
    Disallow: /directory/book2.pdf
    etc.
    where /directory/ is whatever path comes after your domain name.

    For example, if the URL of your book is https://www.mysitename.com/wp/wp-content/images/book1.pdf, then you would code:
    Disallow: /wp/wp-content/images/book1.pdf
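
    If you have Python available, a rough way to sanity-check the rule before uploading it is sketched below. It uses Python’s standard urllib.robotparser module; the helper name is_blocked, the Googlebot user agent string, and the /wp/index.html page are assumptions made for illustration, and the PDF path is just the example from above. It only shows how Python’s parser reads the rule, not what Google itself will do.

```python
from urllib.robotparser import RobotFileParser

# Example rule from the reply above; the path is illustrative, use your own.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp/wp-content/images/book1.pdf
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt tells user_agent not to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://www.mysitename.com"
    # The second path is a hypothetical page that should stay crawlable.
    for path in ("/wp/wp-content/images/book1.pdf", "/wp/index.html"):
        status = "blocked" if is_blocked(ROBOTS_TXT, base + path) else "allowed"
        print(path, status)
```

    If the PDF line does not come back as blocked, the path in the Disallow line probably does not match the part of the URL after the domain name.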

    I don’t know about Download Guard, but Google is well behaved. If you create a robots.txt file that disallows a file Google already has, it will keep the file but not let anyone see it.

    On the other hand, archive.org is likely to have captured the PDF file before you installed the robots.txt file, and will offer it to the world (with a link to it) about 18 months after you first put it on your web site.

  • The topic ‘Prevent Google from Stealing PDF file’ is closed to new replies.