• Resolved jkilbride

    (@jkilbride)


    Hi,

    Thanks for a great plugin! I’m enjoying using it.

    I’m also using WooCommerce and have a static robots.txt file I created that disallows a few directories I don’t want crawled. Here is my robots.txt:

    
    User-agent: *
    Disallow: /cart/
    Disallow: /checkout/
    Disallow: /my-account/
    Disallow: /*add-to-cart=*
    Disallow: /?s=
    Disallow: /search/
    
    Sitemap: https://thedadlands.io/sitemap.xml
    

    However, I’m getting warnings from Google that say “Sitemap contains URLs which are blocked by robots.txt.” Is there any way to add exclusions to the generated sitemap.xml file?

    Thanks!

Viewing 2 replies - 1 through 2 (of 2 total)
  • Plugin Author Sybre Waaijer

    (@cybr)

    Hi @jkilbride,

    On the following pages, you should apply noindex under the “Visibility” tab.

    cart
    my-account
    checkout

    This works because the noindex option not only tells search engines not to index the page, but also removes it from the sitemap.
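    For reference, enabling noindex generally results in a robots meta tag being output in the page’s `<head>`, along the lines of the following (the exact attribute values may vary by plugin and version):

    ```html
    <!-- Emitted on pages marked noindex: tells crawlers not to index the page -->
    <meta name="robots" content="noindex,follow" />
    ```

    Because the page is excluded from indexing, a compliant sitemap generator also drops it from sitemap.xml, which resolves the “blocked by robots.txt” conflict.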

    With that, the errors can be marked as resolved in the Search Console, and they shouldn’t come back.

    I also recommend adding these two lines under User-agent: *:

    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    

    It tells search engines not to crawl anything in the wp-admin directory, while still allowing WordPress’ Ajax endpoint, which themes and plugins may use on the front end.
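    Putting it together with the rules from the original post, the updated robots.txt might look like this (a sketch — keep your own sitemap URL and any site-specific rules):

    ```
    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    Disallow: /*add-to-cart=*
    Disallow: /?s=
    Disallow: /search/

    Sitemap: https://thedadlands.io/sitemap.xml
    ```

    Note that the /cart/, /checkout/, and /my-account/ lines are no longer needed in robots.txt once those pages are set to noindex; removing them lets crawlers see the noindex directive directly instead of being blocked from the pages entirely.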

    Cheers!

    Thread Starter jkilbride

    (@jkilbride)

    Hi @cybr,

    Great, thank you!

  • The topic ‘Remove robots.txt blocked URLs from sitemap’ is closed to new replies.