• Hello Forum,
    For a few weeks now I have been seeing odd search requests in the log of the Search Meter plugin (the plugin records what users have searched for).

    Here is an example log:
    REQUEST_URI: /search/www.ymwears.cn
    REQUEST_METHOD: HEAD
    QUERY_STRING:
    REMOTE_ADDR: 193.112.39.200
    HTTP_USER_AGENT: User-Agent=Mozilla/5.0 (compatible;googlebot/2.1)
    HTTP_REFERER: https://www.google.com/bot.html

    What I figured from this is that a bot calls the URL https://www.mydomain.com/search/searchterm directly, without first visiting my website. But what for?

    I would like to either block these REQUEST_URI values for bots in the .htaccess file, or only allow them if the referer is from inside mydomain.

    I think the allow approach is the better one, as the bots seem to change and a block list would get very long. But so far I have not found a way to do this.

    What I tried is blocking the HTTP_USER_AGENT with this code in the .htaccess file, but it doesn’t seem to work:

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
    RewriteCond %{HTTP_USER_AGENT} Agent [NC]
    RewriteRule ^.* - [F]

    I thought that the above code would return a 403 error for every request whose HTTP_USER_AGENT contains the string "Agent".

    But I still get these search requests.

    • This topic was modified 5 years, 9 months ago by dune1982.
Viewing 2 replies - 1 through 2 (of 2 total)
  • Try using Yoast SEO, which has options for how to handle search pages and other pages that shouldn’t be indexed by search engines. You probably don’t want to block Googlebot at all.

    Who? Chinese black-hat SEO teams
    What? They submit searches through your search form, hoping your site shows some kind of "recent searches" list that would give them free backlinks.
    When? All day long.
    Where? All of your blogs.
    Why? Because black-hats are always looking for links.

    This fails with this plugin because it only lists searches that matched content in your posts.

    How to block this traffic? Put this into your .htaccess file:

    RewriteCond %{REQUEST_URI} ^/search/(.+)$ [OR]
    RewriteCond %{QUERY_STRING} s=(.+) [NC]
    RewriteCond %{REQUEST_METHOD} HEAD
    RewriteRule ^ - [F,L]

    What does it do?
    The first two lines check whether the request is a search, and the third line checks whether it uses the HEAD method. The final line blocks the request with a 403 if line 1 or line 2 matches and line 3 matches as well.
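
    As a hedged variant, the same rule can be written with tighter patterns. The sketch below assumes a standard WordPress setup where searches arrive either as /search/term or as the s query parameter, and that the rule sits above the # BEGIN WordPress block so it runs before WordPress’s own rewrite rules; neither assumption comes from the reply itself. Anchoring the parameter name keeps unrelated parameters that merely end in "s=" (for example tags=) from matching.

    ```apache
    RewriteEngine On
    # Condition 1: pretty-permalink search URL, e.g. /search/term
    RewriteCond %{REQUEST_URI} ^/search/. [OR]
    # Condition 2: query-string search; "s" must be a whole parameter name
    RewriteCond %{QUERY_STRING} (^|&)s=[^&] [NC]
    # Condition 3: the request method is exactly HEAD
    RewriteCond %{REQUEST_METHOD} ^HEAD$
    # (1 OR 2) AND 3: answer with 403 Forbidden
    RewriteRule ^ - [F]
    ```

    The [OR] flag only joins a condition to the one that follows it, so the third condition is still required in every case.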

    Googlebot does NOT use HEAD requests and does not use your search form. Googlebot doesn’t submit any forms, which is why you don’t get spam comments from it.

    When you interact with a form, there are only two ways to submit your query: GET and POST. So nobody has a legitimate need to call your search with a HEAD request. Block them!
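
    For the referer-based approach asked about in the original post, a minimal sketch could look like the following. It assumes the site is served from mydomain.com or www.mydomain.com (the placeholder used in the question) and that the rules sit above the # BEGIN WordPress block. The Referer header is optional and easy to forge, so this also rejects visitors who open a search URL directly or from a bookmark, and it will not stop a bot that sends a matching header.

    ```apache
    RewriteEngine On
    # Only look at search requests: /search/term or ?s=term
    RewriteCond %{REQUEST_URI} ^/search/. [OR]
    RewriteCond %{QUERY_STRING} (^|&)s=[^&] [NC]
    # Assumption: mydomain.com is a placeholder for the real host name.
    # Matches when the Referer is empty or does not point to this site.
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mydomain\.com(/|$) [NC]
    RewriteRule ^ - [F]
    ```

    The request from the example log, with its referer of https://www.google.com/bot.html, would receive a 403 under this rule.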

  • The topic ‘Block specific referrer or agent to enter url’ is closed to new replies.