• Resolved icemanau

    (@icemanau)


    We are receiving some 404 errors on unrecognised URLs. Currently there are 31 items. Most of these are from unknown bots. One is from Bing and one from Google.

    Here is a screenshot of some of the 404s from unknown bots:  https://imgur.com/a/jrd3KU1

    This screenshot shows the 404 from Bing: https://imgur.com/a/WTl33cM

    This one is from Google: https://imgur.com/a/wKUoLHP

    In Yoast settings -> Advanced -> Crawl Optimisation, I have the “Block unwanted bots” switched on: https://imgur.com/a/omC1OBe

    Could the Yoast plugin help fix this issue?

    I note that under Yoast -> Settings -> Advanced -> Crawl Optimisation there is an option, “Advanced: URL clean up -> Remove unregistered URL parameters”. Is this the one I need to switch on?

    Many thanks for your help.

Viewing 6 replies - 1 through 6 (of 6 total)
  • Plugin Support Maybellyne

    (@maybellyne)

    Hello @icemanau,

    Thanks for using the Yoast SEO plugin. It could be that your robots.txt file wasn’t previously following our recommended guidelines. Please share a screenshot of your robots.txt file.

    Also, blocking unwanted bots won’t take care of the URLs already found in Bing/Google. Removing unregistered URL parameters is unrelated in this case. You can learn more about that here.

    Since these URLs are not found, Google will remove them from the index if they are already indexed.

    Thread Starter icemanau

    (@icemanau)

    Hi @maybellyne,

    Thank you for coming back to me on this.

    Here is the screenshot of the robots.txt file: https://imgur.com/a/DBqhSwV

    This file has been in place for a long time. Could be years.

    However, I only switched on “Block unwanted bots” about 3 months ago.

    One more question: should I have this entry in the file?

    Disallow: /search/

    I looked at the recommended guidelines and they are the bare minimum.

    Yes, Google and Bing would remove the unregistered URL parameters. There are only 2 of those anyway.

    I am concerned about the large number of these unregistered URL parameters from unknown bots.

    Thanks for your help.

    Plugin Support Mushrit Shabnam

    (@611shabnam)

    Hi,

    You asked whether you should have the directive Disallow: /search/ in your robots.txt file.

    Your question is a bit outside the scope of the support we provide. Giving SEO advice, in most cases, requires a deep analysis of your site to provide accurate advice for your specific setup, even for what may seem to be a simple question.

    Our suggestion is to follow the recommended guidelines.

    Thread Starter icemanau

    (@icemanau)

    Hi @611shabnam,

    No worries. You can disregard the question about “Disallow: /search/”.

    I would like to get support on the original question. I shared the screenshot of the robots.txt file in my previous message.

    Thanks for your help.

    Plugin Support Maybellyne

    (@maybellyne)

    Hello @icemanau,

    Here is the screenshot of the robots.txt file

    Your robots.txt file is fine. As I mentioned previously, since these URLs return 404s, Google will remove them from the index with time.

    One more question: should I have this entry in the file? Disallow: /search/

    When you toggle ON “Prevent crawling of internal site search URLs” in the crawl optimization settings, it adds three ‘disallow’ rules to your robots.txt file – ?s=, /search/ and /page/*/?s=.
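
As a rough illustration of what those three rules cover, here is a minimal Python sketch that checks a URL path against them. It is not Yoast's code or any crawler's actual implementation. It assumes Google-style matching (prefix match, `*` as a wildcard, optional trailing `$`), and it assumes the first rule is written as `/?s=` with a leading slash. The names `DISALLOW_RULES`, `rule_to_regex` and `is_disallowed` are hypothetical.

```python
import re

# Rule paths taken from the reply above. The leading "/" on the first rule
# is an assumption (robots.txt paths normally start with "/").
DISALLOW_RULES = ["/?s=", "/search/", "/page/*/?s="]


def rule_to_regex(rule):
    """Turn a robots.txt path rule into a compiled regex.

    Assumed semantics: the rule matches as a prefix, "*" matches any run of
    characters, and a trailing "$" anchors the rule at the end of the URL.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then join the pieces with ".*" for "*".
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_disallowed(path_and_query, rules=DISALLOW_RULES):
    """Return True if the path (plus query string) matches any Disallow rule."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in rules)


if __name__ == "__main__":
    for candidate in ["/?s=shoes", "/search/shoes/", "/page/2/?s=shoes", "/blog/my-post/"]:
        print(candidate, is_disallowed(candidate))
```

Under these assumptions, the three search-style URLs are reported as disallowed and an ordinary post URL is not. Keep in mind that robots.txt only asks compliant crawlers not to fetch these paths; it does not stop unknown bots from requesting them, and it does not change the 404 response for URLs that do not exist.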

    I looked at the recommended guidelines and they are the bare minimum.

    Yes, that’s intentional and enough for most websites. Extra disallow rules are added depending on the settings chosen in crawl optimization. Those settings are for advanced users who are sure they really need them.

    Plugin Support Jose Varghese

    (@josevarghese)

    This thread was marked resolved due to a lack of activity, but you’re always welcome to re-open the topic. Please read this post before opening a new request.

  • The topic ‘Unrecognised URL 404s’ is closed to new replies.