• Resolved jimb3

    (@jimb3)


    I am being inundated with crawler traffic coming out of AWS (Boardman, Oregon). I can identify the CIDR block (/12), but the only method available is to enter each network segment (1.2.*.*) to get that traffic excluded from my site statistics. Could a future version offer exclusion by CIDR instead? AWS is basically using 3 different CIDR blocks to crawl the web. The bots open EVERY page and “bounce”, so the statistics are diluted.
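
To illustrate the scale of the manual workaround described above, here is a minimal Python sketch that expands one /12 block into the `a.b.*.*` segments needed to cover it. The range `10.16.0.0/12` is a placeholder, not one of the actual AWS blocks, and `wildcard_segments` is a hypothetical helper, not part of the plugin.

```python
import ipaddress

def wildcard_segments(cidr: str) -> list[str]:
    """Expand an IPv4 CIDR block (/16 or larger) into a.b.*.* patterns."""
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4 or network.prefixlen > 16:
        raise ValueError("expected an IPv4 block of /16 or larger")
    patterns = []
    # Each /16 inside the block corresponds to one a.b.*.* entry.
    for subnet in network.subnets(new_prefix=16):
        first, second, _, _ = str(subnet.network_address).split(".")
        patterns.append(f"{first}.{second}.*.*")
    return patterns

# Placeholder range (assumption): substitute the real /12 being excluded.
segments = wildcard_segments("10.16.0.0/12")
print(len(segments))              # number of wildcard entries required
print(segments[0], segments[-1])  # first and last entry
```

A single /12 needs 16 such entries, so three /12 blocks would need 48 wildcard entries where three CIDR entries would do.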

Viewing 3 replies - 1 through 3 (of 3 total)
  • Plugin Author Ben Sibley

    (@bensibley)

    Hi there,

    Thanks for getting in touch about this.

    We can look into CIDR support, but I wonder if it would be better to use something like Cloudflare so that you can completely block these bots from your website. The IP blocking in our plugin will keep this traffic out of your analytics, but your website will still get accessed by the crawler. Blocking access completely would fix the analytics and also save you the bandwidth the crawler is using. Cloudflare’s IP blocking accepts CIDR ranges.

    Thread Starter jimb3

    (@jimb3)

    The networks are /12 CIDR blocks, and the bots can appear anywhere in them. I have tried blocking the whole network before, but that /12 is also used for customer applications, including some entry points into the internet for mobile clients. I have complained to AWS/Google about their bots not being well behaved, but you know how that goes. The good news is they just traverse all my articles once every few days. The bad news is I can’t really block a dynamically assigned IP, nor can I block them by *.aws.com lookup, since that also blocks the whole CIDR block. It is annoying as hell. Thanks for looking at it.

    Plugin Author Ben Sibley

    (@bensibley)

    Okay, I understand now. We will look into supporting the CIDR format for IP blocking.
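
For readers wondering what a CIDR-based exclusion check involves, here is a rough Python sketch. It is only an illustration and not Independent Analytics code; `EXCLUDED_BLOCKS` and `is_excluded` are hypothetical names, and the ranges are placeholders.

```python
import ipaddress

# Placeholder ranges (assumption): replace with the blocks to exclude.
EXCLUDED_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.16.0.0/12", "192.0.2.0/24")
]

def is_excluded(ip: str) -> bool:
    """Return True if the visitor IP falls inside any excluded CIDR block."""
    address = ipaddress.ip_address(ip)
    return any(address in block for block in EXCLUDED_BLOCKS)

print(is_excluded("10.20.5.9"))   # inside 10.16.0.0/12
print(is_excluded("10.32.0.1"))   # just past the end of the /12
```

A check like this only decides whether a visit is counted in the statistics. The crawler still reaches the site, which is the limitation raised earlier in the thread.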

    I think it would still be a good idea to try out Cloudflare since they have numerous ways to block bot traffic beyond IP/CIDR blacklisting. In Independent Analytics, we do filter out traffic from bots, but only those that are self-reporting, which includes most search engine spiders. A bot that gets recorded in the analytics is most likely masking its identity instead of reporting itself as a bot. Cloudflare has a firewall and “bot fight mode” that I bet would filter this traffic out without affecting your real visitors.

  • The topic ‘Block by CIDR, not IP’ is closed to new replies.