
Resolved kumawathemant032
(@kumawathemant032)

2 years ago

Hello Folks,

I operate a PDF sharing site and use Amazon S3 to deliver the PDF files. My issue is that Google is indexing the PDF files directly and showing them in search results. I tried to use X-Robots-Tag, but it is not working properly. Check these 2 URLs:
On Hosting Server: https://latestpdf.com/wp-content/uploads/2022/11/Venus-Tools-Product-Price-List.pdf
Amazon CDN: https://cdn.latestpdf.com/wp-content/uploads/2022/11/05130115/Venus-Tools-Product-Price-List.pdf

I want to enable Noindex for all PDFs in bulk for both locations.

Viewing 4 replies - 1 through 4 (of 4 total)

Plugin Support devnihil
(@devnihil)

2 years ago
@kumawathemant032 Thanks for your message.

We aren’t sure exactly how you tried to implement the X-Robots-Tag header, but that is the approach we would recommend. How to implement it successfully varies with server configuration, so unfortunately we can’t say whether a given code snippet will definitely work for you.

This is especially true because the code for this change goes in the .htaccess file (for Apache), which is a delicate file to edit: even a single incorrect character has the potential to break your entire site.

If you are having problems getting the X-Robots-Tag header to insert a noindex into the headers of the PDFs, we’d recommend contacting your hosting provider for additional support. They are familiar with the server configuration and will know which code or method is best for setting this.

That being said, I tried a few suggestions for the code to use in .htaccess, such as in the examples found here, and I personally had success with the following ones.

In this example, all .pdf files had the X-Robots tag set to noindex:
```
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```
This example sets noindex for several file types at once and also adds noarchive and nosnippet. The dot before the extension list is escaped so that it matches a literal dot:
```
<FilesMatch "\.(doc|pdf|jpg)$">
Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>
```
Again, if these examples don’t work for you, it’s not something we’d be able to troubleshoot, since whether they succeed depends on the server configuration.

One last thing: when testing whether the noindex has been added to the headers, we’d recommend verifying it from a terminal or command prompt. For example, I use the following command to check the file:

curl -I https://latestpdf.com/wp-content/uploads/2022/11/Venus-Tools-Product-Price-List.pdf
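
The same check can be scripted when there are many files to verify. The following is a minimal Python sketch, not an official tool: the helper names `has_noindex` and `fetch_headers` are made up for illustration, and it assumes the server answers HEAD requests the same way it answers `curl -I`.

```python
from urllib.request import Request, urlopen


def has_noindex(headers):
    """Return True if any X-Robots-Tag value in (name, value) pairs asks for noindex."""
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # A value may look like "noindex, noarchive" or "googlebot: noindex".
        for part in value.split(","):
            # Drop an optional "botname:" prefix and compare the directive itself.
            directive = part.rsplit(":", 1)[-1].strip().lower()
            # "none" is treated as shorthand for "noindex, nofollow".
            if directive in ("noindex", "none"):
                return True
    return False


def fetch_headers(url):
    """Send a HEAD request (like `curl -I`) and return the response headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return list(response.headers.items())


# Usage sketch (performs a network request, so it is left commented out):
# url = "https://latestpdf.com/wp-content/uploads/2022/11/Venus-Tools-Product-Price-List.pdf"
# print(has_noindex(fetch_headers(url)))
```

Keeping the header parsing separate from the network call makes it easy to test the parsing against pasted `curl -I` output.
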
Thread Starter kumawathemant032
(@kumawathemant032)

2 years ago

Thanks for such a detailed response, but it solved only half of the problem. Noindex is now set through X-Robots-Tag for example.com/abc.pdf, which is on the root domain served by Apache. But to actually deliver the PDF files, I am using a CDN (Amazon S3), with URLs such as cdn.example.com/abc.pdf.

PDFs on this subdomain are still indexable. Both the robots.txt file and X-Robots-Tag are ineffective for the subdomain.

scruffy1
(@scruffy1)

2 years ago

@kumawathemant032
We had a similar problem. I cannot remember how we did it, but there is a way within the AWS control panel to achieve what you want.
Cheers
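
One approach that may fit this description, assuming the CDN subdomain is served by a CloudFront distribution in front of the S3 bucket (the thread does not confirm this), is a CloudFront response headers policy that adds a custom `X-Robots-Tag` header. The sketch below only builds the policy configuration; the function name `build_noindex_policy_config` and the policy name `pdf-noindex` are hypothetical.

```python
def build_noindex_policy_config(name):
    """Build a CloudFront response headers policy config that adds X-Robots-Tag."""
    return {
        "Name": name,
        "Comment": "Ask crawlers not to index files served through the CDN",
        "CustomHeadersConfig": {
            "Quantity": 1,
            "Items": [
                # Override=True replaces any X-Robots-Tag sent by the origin.
                {"Header": "X-Robots-Tag", "Value": "noindex", "Override": True},
            ],
        },
    }


# Usage sketch (assumes boto3 is installed and AWS credentials are configured;
# left commented out because it changes a real AWS account):
# import boto3
# cloudfront = boto3.client("cloudfront")
# cloudfront.create_response_headers_policy(
#     ResponseHeadersPolicyConfig=build_noindex_policy_config("pdf-noindex")
# )
```

The policy would still need to be attached to the relevant cache behavior of the distribution before the header appears in responses. Note also that a crawler can only see the header if robots.txt on the CDN subdomain does not block it from fetching the PDFs.
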

Thread Starter kumawathemant032
(@kumawathemant032)

2 years ago

Can you share more details on how you solved this issue?


The topic ‘Enable Noindex for all PDF Files’ is closed to new replies.
