• Resolved rob212

    (@rob212)


    I installed StopBadBots about a week ago and checked the settings to block all the bad bots in the table. But according to my Raw Access Log, a number of these bad bots are still accessing my website at extremely high rates, including Baiduspider and SemRush, for example. Is there some reason that I don’t understand that these bots would not be blocked even though they are on the table list to be blocked?

Viewing 12 replies - 1 through 12 (of 12 total)
  • Plugin Author Bill Minozzi

    (@sminozzi)

    Hi Rob212,

    Please check the settings page:
    Dashboard => Stop Bad Bots => Settings

    and confirm that these two options are set to Yes:
    Block all Bots included at Bad Bots Table?

    Block all IPs included at Bad IPs Table?

    Also check whether

    Dashboard => Stop Bad Bots => Dashboard
    shows the number of bots blocked.

    If you want to run a test, open a Linux terminal and type
    wget yoursite.com <= replace yoursite.com with your site address

    You will see that the size of the page is 0 bytes.

    This means that the bot tries to open your page and gets a blank screen.
    This saves your bandwidth, and the bots will give up on your site because it doesn’t work for them. They are unable to steal your data or look for vulnerabilities.

    Happy new year!

    Cheers,
    Bill
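The wget test above can also be scripted so that the request carries a specific bot name. The following is a minimal sketch, not part of the plugin: the function names are hypothetical, and the user-agent string in the usage note is made up for illustration (it only needs to contain a nickname from the Bad Bots Table).

```python
import urllib.error
import urllib.request


def build_bot_request(url, user_agent):
    """Return a GET request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_bot(url, user_agent, timeout=10):
    """Fetch url while posing as a bot; return (HTTP status, body size in bytes)."""
    request = build_bot_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx responses; status and body are still readable.
        return error.code, len(error.read())
```

Calling `fetch_as_bot("https://yoursite.com/", "SemrushBot (test request)")` against your own site returns the status code and body size that a bot with that name would receive. A body size of 0 would correspond to the blank page described above.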

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi Rob212

    Our previous answer was held for moderation by the WordPress automated system and will be manually reviewed by a moderator.

    In the meantime, please visit our FAQ page and read the last question, which covers this.
    https://stopbadbots.com/faq/

    Happy new year!

    Cheers,
    Bill

    Anonymous User 14978628

    (@anonymized-14978628)

    This is interesting, as I noticed that StopBadBots blocked SemrushBot, which I also have in my htaccess.

    So I’m not sure why it wasn’t blocked by my htaccess, but the plugin says it blocked it. I haven’t checked my raw access logs to confirm, though.

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi Marty,

    We are unable to prevent a bot from requesting your page.
    But we can show it a blank screen, which saves bandwidth, and the bot will probably give up on your site.

    Cheers,
    Bill

    • This reply was modified 5 years, 11 months ago by Bill Minozzi.

    I’m having the same trouble. Awstats reports that many bots that should have been blocked are still accessing the site and consuming bandwidth. A small example from the Bad Bots Table:

    Baiduspider Baidu Enabled Blocked 0
    Baiduspider-image Baiduspider-image Enabled Blocked 0
    Baiduspider-video Baiduspider-video/1.1 Enabled Blocked 0

    Awstats reports:
    Baiduspider Hits 792+9 Bandwidth 8.56 MB Last visit: 14 Jan 2019 – 11:11
    Baiduspider-image Hits 227 Bandwidth 15.65 MB 13 Jan 2019 – 16:34

    Bad Bots reports MJ12bot blocked 9710 times, but Awstats reports 1355 hits and 56 MB of bandwidth.

    And so many others…
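One way to cross-check figures like these is to tally hits and bytes per bot directly from the raw access log. This is a rough sketch that assumes the log uses the Apache "combined" format. The function name and the regular expression are illustrative and do not come from Awstats or the plugin.

```python
import re
from collections import defaultdict

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def tally_bots(lines, nicknames):
    """Return {nickname: [hits, bytes]} for user agents containing a nickname.

    List more specific nicknames first (for example "Baiduspider-image"
    before "Baiduspider"), because each request is counted once, under the
    first nickname that matches.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        for nickname in nicknames:
            if nickname.lower() in agent:
                totals[nickname][0] += 1
                totals[nickname][1] += size
                break
    return dict(totals)
```

Feeding it the lines of the log file, for example `tally_bots(open("access.log"), ["Baiduspider-image", "Baiduspider", "MJ12bot"])` with a hypothetical file name, gives per-bot totals that can be compared with both the plugin's counters and the Awstats report.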

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi,

    When the user agent contains the nickname, our system shows an empty screen.

    This minimizes the bandwidth spent.

    Some possibilities:

    Try disabling our plugin for a whole day and monitor the bandwidth usage.

    Our plugin can protect only WordPress files. Check whether you have non-WordPress files on your site, and whether these bots are hitting robots.txt, sitemap.xml and other files.

    Try monitoring your bandwidth with another tool as well, maybe the WHM panel.

    Double-check that our plugin is enabled. Try turning notifications on and watch your inbox for emails for a couple of minutes.

    Also check that you have the latest version installed.

    Also check whether you have another anti-bot or anti-hacker plugin installed that lets the bots through before our plugin runs.

    You can use Linux wget to test whether our plugin is working. Try to access your site with wget from a Linux machine.

    Cheers,
    Bill
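The point about non-WordPress files can be checked against the access log by separating requests for files that the web server usually serves straight from disk from requests that reach WordPress. The sketch below is only a heuristic based on file extensions. The extension list and the function names are assumptions for illustration, and a site that generates robots.txt or sitemap.xml through WordPress would need them treated differently.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions a web server typically serves directly from disk.
# This list is an illustrative assumption, not taken from the plugin.
STATIC_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico", ".pdf", ".txt", ".xml",
}


def likely_bypasses_wordpress(request_target):
    """Guess whether a request target names a static file rather than a page."""
    path = urlsplit(request_target).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in STATIC_EXTENSIONS


def split_requests(request_targets):
    """Partition request targets into (static, dynamic) lists."""
    static, dynamic = [], []
    for target in request_targets:
        if likely_bypasses_wordpress(target):
            static.append(target)
        else:
            dynamic.append(target)
    return static, dynamic
```

If most of a blocked bot's bandwidth falls in the static list (images, for instance), that traffic probably never passes through PHP, which would be consistent with the plugin not seeing it.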

    Anonymous User 14978628

    (@anonymized-14978628)

    @digbymaass try adding this to your root htaccess. Please let us know if it helps:

    # Bad Bots
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^.*(linkdexbot/2.1|linkdexbot/2.1|Gigabot/3.0|CatchBot/2.0|CatchBot/1.0|CCBot/2.0|CCBot/1.0).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(Scrapy|PhantomJS|SiteExplorer|linkCheck|python-requests|Thumbshots|Cliqzbot|IDG|IDG/IT|adbeat|qwant.com|Photon|KOCMOHABT|HEADMasterSEO|AppEngine|WebImages|SMUrlExpander|Blackboard\ Safeassign|PrintFriendly.com|Mojolicious|infegy|Integrity|Nuzzel|MagpieRSS|Salis|Phantom|linkfluence|Kalisto|VeBot|Daedulus|spbot|Oya|Belenus|BUbiNG|Pinterest|FeedBurner|Feedfetcher|ADmantX|ubermetrics-technologies|seznam|SeznamBot|autocite|Ruby|cmcm|Pioneer|Baiduspider|Baidu|YandexBot|Wget|apache|leacher|1Noonbot|1on1searchBot|3D_SEARCH|3DE_SEARCH2|3GSE|50.nu|192.comAgent|360Spider|A6-Indexer|AASP|ABACHOBot|Abonti|abot|AbotEmailSearch|Aboundex|AboutUsBot|AccMonitor\ Compliance|accoona|AChulkov.NET\ page\ walker|Acme.Spider|AcoonBot|acquia-crawler|ActiveTouristBot|Acunetix|Ad\ Muncher|AdamM|adbeat_bot|adminshop.com|Advanced\ Email|AESOP_com_SpiderMan|AESpider|AF\ Knowledge\ Now\ Verity|aggregator:Vocus|ah-ha.com|AIBOT|aiHitBot|aipbot|AISIID|AITCSRobot|Akamai-SiteSnapshot|AlexaWebSearchPlatform|AlexaBot|AlexfDownload|Alexibot|AlkalineBOT|All\ Acronyms|Amfibibot|AmPmPPC.com|AMZNKAssocBot|Anemone|Anonymous|Anonymouse.org|AnotherBot|AnswerBot|AnswerBus|AnswerChase\ PROve|AntBot|antibot-|AntiSantyWorm|Antro.Net|AONDE-Spider|Aport|Aqua_Products|AraBot|Arachmo|Arachnophilia|archive.org_bot|aria\ eQualizer|aria2|arianna.libero.it|Arikus_Spider|Art-Online.com|ArtavisBot|Artera|ASpider|ASPSeek|asterias|AstroFind|athenusbot|AtlocalBot|Atomic_Email_Hunter|attach|attrakt|attributor|Attributor.comBot|augurfind|AURESYS|AutoBaron|autoemailspider|autowebdir|AVSearch-|axfeedsbot|Axonize-bot|Ayna|b2w|BackDoorBot|BackRub|BackStreet\ Browser|BackWeb|Baiduspider-video|Bandit|BatchFTP|baypup|BDFetch|BecomeBot|BecomeJPBot|BeetleBot|Bender|besserscheitern-crawl|betaBot|Big\ Brother|Big\ Data|Bigado.com|BigCliqueBot|Bigfoot|BIGLOTRON|Bilbo|BilgiBetaBot|BilgiBot|binlar|bintellibot|bitlybot|BitvoUserAgent|bixocrawler|Bizbot003|BizBot04|BizBot04\ 
kirk.overleaf.com|Black.Hole|Black\ Hole|Blackbird|BlackWidow|bladder\ fusion|Blaiz-Bee|BLEXBot|Blinkx|BlitzBOT|Blog\ Conversation\ Project|BlogMyWay|BlogPulseLive|BlogRefsBot|BlogScope|Blogslive|BloobyBot|BlowFish|BLT|bnf.fr_bot|BoaConstrictor|BoardReader-Image-Fetcher|BOI_crawl_00|BOIA-Scan-Agent|BOIA.ORG-Scan-Agent|boitho.com-dc|Bookmark\ Buddy|bosug|Bot\ Apoena|BotALot|BotRightHere|Botswana|bottybot|BpBot|BRAINTIME_SEARCH|BrokenLinkCheck.com|BrowserEmulator|BrowserMob|BruinBot|BSearchR&D|BSpider|btbot|Btsearch|Buddy|Buibui|BuildCMS|BuiltBotTough|Bullseye|bumblebee|BunnySlippers|BuscadorClarin|Butterfly|BuyHawaiiBot|BuzzBot|byindia|BySpider|byteserver|bzBot|c\ r\ a\ w\ l\ 3\ r|CacheBlaster|CACTVS\ Chemistry|Caddbot|Cafi|Camcrawler|CamelStampede|Canon-WebRecord|Canon-WebRecordPro|CareerBot|casper|cataguru|CatchBot|CazoodleBot|CCBot|CCGCrawl|ccubee|CD-Preload|CE-Preload|Cegbfeieh|Cerberian\ Drtrs|CERT\ FigleafBot|cfetch|CFNetwork|Chameleon|ChangeDetection|Charlotte|Check&Get|Checkbot|Checklinks|checkprivacy|CheeseBot|ChemieDE-NodeBot|CherryPicker|CherryPickerElite|CherryPickerSE|Chilkat|ChinaClaw|CipinetBot|cis455crawler|citeseerxbot|cizilla.com|ClariaBot|clshttp|Clushbot|cmsworldmap|coccoc|CollapsarWEB|Collector|combine|conceptbot|ConnectSearch|conpilot|ContentSmartz|ContextAd|contype|cookieNET|CoolBot???|CoolCheck|Copernic|Copier|CopyRightCheck|core-project|cosmos|Covario-IDS|Cowbot-|Cowdog|crabbyBot|crawl|Crawl_Application|crawl.UserAgent|CrawlConvera|crawler|crawler_for_infomine|CRAWLER-ALTSE.VUNET.ORG-Lynx|crawler-upgrade-config|crawler.kpricorn.org|crawler@|crawler4j|crawler43.ejupiter.com|Crawly|CreativeCommons|Crescent|Crescent\ Internet\ ToolPak\ HTTP\ OLE\ Control|cs-crawler|CSE\ HTML\ Validator|CSHttpClient|Cuasarbot|culsearch|Curl|Custo|Cutbot|cvaulev|Cyberdog|CyberNavi_WebGet|CyberSpyder|CydralSpider).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(D1GArabicEngine|DataCha0s|DataFountains|DataparkSearch|DataSpearSpiderBot|DataSpider|Dattatec.com|Dattatec.com-Sitios-Top|Daumoa|DAUMOA-video|DAUMOA-web|Declumbot|Deepindex|deepnet|DeepTrawl|dejan|del.icio.us-thumbnails|DelvuBot|Deweb|DiaGem|Diamond|DiamondBot|diavol|DiBot|didaxusbot|DigExt|Digger|DiGi-RSSBot|DigitalArchivesBot|DigOut4U|DIIbot|Dillo|Dir_Snatch.exe|DISCo|DISCo\ Pump|discobot|DISCoFinder|Distilled-Reputation-Monitor|Dit|DittoSpyder|DjangoTraineeBot|DKIMRepBot|DoCoMo|DOF-Verify|domaincrawler|DomainScan|DomainWatcher|dotbot|DotSpotsBot|Dow\ Jonesbot|Download|Download\ Demon|Downloader|DOY|dragonfly|Drip|drone|DTAAgent|dtSearchSpider|dumbot|Dwaar|Dwaarbot|DXSeeker|EAH|EasouSpider|EasyDL|ebingbong|EC2LinkFinder|eCairn-Grabber|eCatch|eChooseBot|ecxi|EdisterBot|EduGovSearch|egothor|eidetica.com|EirGrabber|ElisaBot|EllerdaleBot|EMail\ Exractor|EmailCollector|EmailLeach|EmailSiphon|EmailWolf|EMPAS_ROBOT|EnaBot|endeca|EnigmaBot|Enswer\ Neuro|EntityCubeBot|EroCrawler|eStyleSearch|eSyndiCat|Eurosoft-Bot|Evaal|Eventware|Everest-Vulcan|Exabot|Exabot-Images|Exabot-Test|Exabot-XXX|ExaBotTest|ExactSearch|exactseek.com|exooba|Exploder|explorersearch|extract|Extractor|ExtractorPro|EyeNetIE|ez-robot|Ezooms|factbot|FairAd\ Client|falcon|Falconsbot|fast-search-engine|FAST\ Data\ Document|FAST\ ESP|fastbot|fastbot.de|FatBot|Favcollector|Faviconizer|FDM|FedContractorBot|feedfinder|FelixIDE|fembot|fetch_ici|Fetch\ API\ Request|fgcrawler|FHscan|fido|Filangy|FileHound|FindAnISP.com_ISP_Finder|findlinks|FindWeb|Firebat|Fish-Search-Robot|Flaming\ 
AttackBot|Flamingo_SearchEngine|FlashCapture|FlashGet|flicky|FlickySearchBot|flunky|focused_crawler|FollowSite|Foobot|Fooooo_Web_Video_Crawl|Fopper|FormulaFinderBot|Forschungsportal|fr_crawler|Francis|Freecrawl|FreshDownload|freshlinks.exe|FriendFeedBot|frodo.at|froGgle|FrontPage|Froola|FU-NBI|full_breadth_crawler|FunnelBack|FunWebProducts|FurlBot|g00g1e|G10-Bot|Gaisbot|GalaxyBot|gazz|gcreep|generate_infomine_category_classifiers|genevabot|genieBot|GenieBotRD_SmallCrawl|Genieo|Geomaxenginebot|geometabot|GeonaBot|GeoVisu|GermCrawler|GetHTMLContents|Getleft|GetRight|GetSmart|GetURL.rexx|GetWeb!|Giant|GigablastOpenSource|Gigabot|Girafabot|GleameBot|gnome-vfs|Go-Ahead-Got-It|Go!Zilla|GoForIt.com|GOFORITBOT|gold|Golem|GoodJelly|Gordon-College-Google-Mini|goroam|GoSeebot|gotit|Govbot|GPU\ p2p|grab|Grabber|GrabNet|Grafula|grapeFX|grapeshot|GrapeshotCrawler|grbot|GreenYogi\ [ZSEBOT]|Gromit|GroupMe|GroupHigh|grub|grub-client|Grubclient-|GrubNG|GruBot|GSLFbot|GT::WWW|Gulliver|GulperBot|GurujiBot|GVC|GVC\ BUSINESS|gvcbot.com|HappyFunBot|harvest|HarvestMan|Hatena\ Antenna|Hawler|Hazel's\ Ferret\ hopper|hcat|hclsreport-crawler|HD\ nutch\ agent|Header_Test_Client|healia|Helix|hijbul-heritrix-crawler|HiScan|HiSoftware\ AccMonitor|HiSoftware\ AccVerify|hitcrawler_|hivaBot|hloader|HMSEbot|HMView|hoge|holmes|HomePageSearch|Hooblybot-Image|HooWWWer|Hostcrawler|HSFT\ -\ Link|HSFT\ -\ LVU|HSlide|ht:|htdig|Html\ Link\ Validator|HTMLParser|HTTP::Lite|httplib|HTTrack|Huaweisymantecspider|hul-wax|humanlinks|HyperEstraier|Hyperix).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(ia_archiver|IAArchiver-|ibuena|iCab|ICDS-Ingestion|ichiro|iCopyright\ Conductor|id-search|IDBot|IEAutoDiscovery|IECheck|iHWebChecker|IIITBOT|iim_405|IlseBot|IlTrovatore|Iltrovatore-Setaccio|ImageBot|imagefortress|ImagesHereImagesThereImagesEverywhere|ImageVisu|imds_monitor|imo-google-robot-intelink|IncyWincy|Industry\ Cortexcrawler|indylabs_marius|InelaBot|Inet32\ Ctrl|inetbot|InfoLink|INFOMINE|infomine.ucr.edu|InfoNaviRobot|Informant|Infoseek|InfoTekies|InfoUSABot|INGRID|Inktomi|InsightsCollector|InsightsWorksBot|InspireBot|InsumaScout|Intelix|InterGET|Internet\ Ninja|InternetLinkAgent|Interseek|IOI|ip-web-crawler.com|IPAdd|Ipselonbot|Iria|IRLbot|Iron33|Isara|iSearch|iSiloX|IsraeliSearch|IstellaBot|its-learning|IU_CSCI_B659_class_crawler|iVia|iVia\ Page\ Fetcher|JadynAve|JadynAveBot|jakarta|Java|Jbot|JemmaTheTourist|JennyBot|Jetbot|JetBrains\ Omea\ Pro|JetCar|Jim|JoBo|JobSpider_BA|JOC|JoeDog|JoyScapeBot|JSpyda|JubiiRobot|jumpstation|Junut|JustView|Jyxobot|K.S.Bot|KakcleBot|kalooga|KaloogaBot|kanagawa|KATATUDO-Spider|Katipo|kbeta1|Kenjin.Spider|KeywenBot|Keyword.Density|Keyword\ Density|kinjabot|KIT-Fireball|Kitenga-crawler-bot|KiwiStatus|kmbot-|kmccrew|Knight|KnowItAll|Knowledge.com|Knowledge\ Engine|KoepaBot|Koninklijke|KrOWLer|KSbot|kuloko-bot|kulturarw3|KummHttp|Kurzor|Kyluka|L.webis|LabelGrab|Labhoo|labourunions411|lachesis|Lament|LamerExterminator|LapozzBot|larbin|LARBIN-EXPERIMENTAL|LBot|LBBROWSER|LeapTag|LeechFTP|LeechGet|LetsCrawl.com|LexiBot|LexxeBot|lftp|libcrawl|libiViaCore|libWeb|libwww|libwww-perl|likse|Linguee|link_checker|LinkAlarm|linkbot|LinkCheck\ by\ Siteimprove.com|LinkChecker|linkdexbot|linkdex.com|LinkextractorPro|LinkLint|linklooker|Linkman|LinkScan|LinksCrawler|LinksManager.com_bot|LinkSweeper|linkwalker|LiteFinder|LitlrBot|Little\ Grabber\ at\ Skanktale.com|Livelapbot|LM\ 
Harvester|LMQueueBot|LNSpiderguy|LoadTimeBot|LocalcomBot|locust|LolongBot|LookBot|Lsearch|lssbot|LWP|lwp-request|lwp-trivial|LWP::Simple|Lycos_Spider|Lydia\ Entity|LynnBot|Lytranslate|Mag-Net|Magnet|magpie-crawler|Magus|Mail.Ru|Mail.Ru_Bot|MAINSEEK_BOT|Mammoth|MarkWatch|MaSagool|masidani_bot_|Mass|Mata.Hari|Mata\ Hari|matentzn\ at\ cs\ dot\ man\ dot\ ac\ dot\ uk|maxamine.com--robot|maxamine.com-robot|maxomobot|Maxthon$|McBot|MediaFox|medrabbit|Megite|MemacBot|Memo|MendeleyBot|Mercator-|mercuryboard_user_agent_sql_injection.nasl|MerzScope|metacarta|Metager2|metager2-verification-bot|MetaGloss|METAGOPHER|metal|metaquerier.cs.uiuc.edu|METASpider|Metaspinner|MetaURI|MetaURI\ API|MFC_Tear_Sample|MFcrawler|MFHttpScan|Microsoft.URL|MIIxpc|miner|mini-robot|minibot|miniRank|Mirror|Missigua\ Locator|Mister.PiX|Mister\ PiX|Miva|MJ12bot|mnoGoSearch|mod_accessibility|moduna.com|moget|MojeekBot|MOMspider|MonkeyCrawl|MOSES|Motor|mowserbot|MQbot|MSE360|MSFrontPage|MSIECrawler|MSIndianWebcrawl|MSMOBOT|Msnbot|msnbot-products|MSNPTC|MSRBOT|MT-Soft|MultiText|My_Little_SearchEngine_Project|my-heritrix-crawler|MyApp|MYCOMPANYBOT|mycrawler|MyEngines-US-Bot|MyFamilyBot|Myra|nabot|nabot_|Najdi.si|Nambu|NAMEPROTECT|NatchCVS|naver|naverbookmarkcrawler|NaverBot|Navroad|NearSite|NEC-MeshExplorer|NeoScioCrawler|NerdByNature.Bot|NerdyBot|Nerima-crawl-).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(Nessus|NESSUS::SOAP|nestReader|Net::Trackback|NetAnts|NetCarta\ CyberPilot\ Pro|Netcraft|NetID.com|NetMechanic|Netprospector|NetResearchServer|NetScoop|NetSeer|NetShift=|NetSongBot|Netsparker|NetSpider|NetSrcherP|NetZip|NetZip-Downloader|NewMedhunt|news|News_Search_App|NewsGatherer|Newsgroupreporter|NewsTroveBot|NextGenSearchBot|nextthing.org|NHSEWalker|nicebot|NICErsPRO|niki-bot|NimbleCrawler|nimbus-1|ninetowns|Ninja|NjuiceBot|NLese|Nogate|Nomad-V2.x|NoteworthyBot|NPbot|NPBot-|NRCan\ intranet|NSDL_Search_Bot|nu_tch-princeton|nuggetize.com|nutch|nutch1|NutchCVS|NutchOrg|NWSpider|Nymesis|nys-crawler|ObjectsSearch|oBot|Obvius\ external\ linkcheck|Occam|Ocelli|Octopus|ODP\ entries|Offline.Explorer|Offline\ Explorer|Offline\ Navigator|OGspider|OmiExplorer_Bot|OmniExplorer_Bot|omnifind|OmniWeb|OnetSzukaj|online\ link\ validator|OOZBOT|Openbot|Openfind|Openfind\ data|OpenHoseBot|OpenIntelligenceData|OpenISearch|OpenSearchServer_Bot|OpiDig|optidiscover|OrangeBot|ORISBot|ornl_crawler_1|ORNL_Mercury|osis-project.jp|OutfoxBot|OutfoxMelonBot|OWLER-BOT|Owlin|owsBot|ozelot|P3P\ Client|page_verifier|PageBitesHyperBot|Pagebull|PageDown|PageFetcher|PageGrabber|PagePeeker|PageRank\ Monitor|pamsnbot.htm|Panopy|panscient.com|Pansophica|Papa\ Foto|PaperLiBot|parasite|parsijoo|Pathtraq|Pattern|Patwebbot|pavuk|PaxleFramework|PBBOT|pcBrowser|pd-crawler|PECL::HTTP|Pcore-HTTP|penthesila|PeoplePal|perform_crawl|PerMan|PGP-KA|PHPCrawl|PhpDig|PicoSearch|pipBot|pipeLiner|Pita|pixfinder|PiyushBot|planetwork|PleaseCrawl|Plucker|Plukkie|Plumtree|Pockey|Pockey-GetHTML|PoCoHTTP|pogodak.ba|Pogodak.co.yu|Poirot|polybot|Pompos|Poodle\ predictor|PopScreenBot|PostPost|PrivacyFinder|ProjectWF-java-test-crawler|ProPowerBot|ProWebWalker|psbot|psbot-page|PSS-Bot|psycheclone|pub-crawler|pucl|pulseBot\ \(pulse|Pump|purebot|PWeBot|pycurl|Python-urllib|pythonic-crawler|PythonWikipediaBot|q1|QEAVis\ 
agent|QFKBot|qualidade|Qualidator.com|QuepasaCreep|QueryN.Metasearch|QueryN\ Metasearch|quest.durato|Quintura-Crw|QunarBot|Qweery_robot.txt_CheckBot|QweeryBot|r2iBot|R6_CommentReader|R6_FeedFetcher|R6_VoteReader|RaBot|Radian6|radian6_linkcheck|RAMPyBot|RankurBot|RcStartBot|RealDownload|Reaper|REBI-shoveler|Recorder|RedBot|RedCarpet|ReGet|RepoMonkey|RepoMonkey\ Bait|Riddler|RIIGHTBOT|RiseNetBot|RiverglassScanner|RoboPal|Robosourcer|robot|robotek|robots|Robozilla|rogerBot|Rome\ Client|Rondello|Rotondo|Roverbot|RPT-HTTPClient|rtgibot|RufusBot|Runnk\ online\ rss\ reader|SafetyNet|s~stremor-crawler|S2Bot|SafariBookmarkChecker|SaladSpoon|Sapienti|SBIder|SBL-BOT|SCFCrawler|Scich|ScientificCommons.org|ScollSpider|ScooperBot|Scooter|ScoutJet|ScrapeBox|Scrapy|SCrawlTest|Scrubby|scSpider|Scumbot|SeaMonkey$|Search-Channel|Search-Engine-Studio|search.KumKie.com|search.msn.com|search.updated.com|search.usgs.gov|Search\ Publisher|Searcharoo.NET|SearchBlox|searchbot|searchengine|searchhippo.com|SearchIt-Bot|searchmarking|searchmarks|searchmee_v|SearchmetricsBot|searchmining|SearchnowBot_v1|searchpreview|SearchSpider.com|SearQuBot|Seekbot|Seeker.lookseek.com|SeeqBot|seeqpod-vertical-crawler|Selflinkchecker|Semager|semanticdiscovery|Semantifire1|semisearch|SemrushBot|Senrigan|SEOENGWorldBot|ShablastBot|ShadowWebAnalyzer|Shareaza|Shelob|sherlock|ShopWiki|ShowLinks|ShowyouBot|siclab|silk|Siphon|SiteArchive|SiteCheck-sitecrawl|sitecheck.internetseer.com|SiteFinder|SiteGuardBot|SiteOrbiter|SiteSnagger|SiteSucker|SiteSweeper|SiteXpert|SkimBot|SkimWordsBot|SkreemRBot|skygrid|Skywalker|Sleipnir|slow-crawler|SlySearch|smart-crawler|SmartDownload|Smarte|smartwit.com|Snake|Snapbot|SnapPreviewBot|Snappy|snookit|Snooper|Snoopy|SocialSearcher|SocSciBot|SOFT411\ Directory|sogou|sohu-search|sohu\ agent|Sokitomi|Solbot|sootle|Sosospider|Space\ Bison|Space\ Fung|SpaceBison|SpankBot|spanner|Spatineo\ Monitor\ 
Controller|special_archiver|SpeedySpider|Sphider|Sphider2|spider|Spider.TerraNautic.net|SpiderEngine|SpiderKU|SpiderMan|Spinn3r|Spinne|sportcrew-Bot|spyder3.microsys.com|sqlmap|Squid-Prefetch|SquidClamAV_Redirector|Sqworm|SrevBot|sslbot|SSM\ Agent|StackRambler|StarDownloader|statbot|statcrawler|statedept-crawler|Steeler|STEGMANN-Bot|stero|Stripper|Stumbler|suchclip|sucker|SumeetBot|SumitBot|SummizeBot|SummizeFeedReader|SuperBot|superbot.com|SuperHTTP|SuperLumin|SuperPagesBot|Supybot|SURF|Surfbot|SurfControl|SurveyBot|suzuran|SWEBot|swish-e|SygolBot|SynapticWalker|Syntryx\ ANT\ Scout\ Chassis\ Pheromone|SystemSearch-robot|Szukacz).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(T-H-U-N-D-E-R-S-T-O-N-E|Tailrank|tAkeOut|TAMU_CRAWLER|TapuzBot|Tarantula|targetblaster.com|TargetYourNews.com|TAUSDataBot|taxinomiabot|Tecomi|TeezirBot|Teleport|Teleport\ Pro|TeleportPro|Telesoft|Teradex\ Mapper|TERAGRAM_CRAWLER|TerrawizBot|testbot|testing\ of|TextBot|thatrobotsite.com|The.Intraformant|The\ Dyslexalizer|The\ Intraformant|TheNomad|Theophrastus|theusefulbot|TheUsefulbot_|ThumbBot|thumbshots-de-bot|tigerbot|TightTwatBot|TinEye|Titan|to-dress_ru_bot_|to-night-Bot|toCrawl|Topicalizer|topicblogs|Toplistbot|TopServer\ PHP|topyx-crawler|Touche|TourlentaScanner|TPSystem|TRAAZI|TranSGeniKBot|travel-search|TravelBot|TravelLazerBot|Treezy|TREX|TridentSpider|Trovator|True_Robot|tScholarsBot|TsWebBot|TulipChain|turingos|turnit|TurnitinBot|TutorGigBot|TweetedTimes|TweetmemeBot|TwengaBot|TwengaBot-Discover|Twiceler|Twikle|twinuffbot|Twisted\ PageGetter|Twitturls|Twitturly|TygoBot|TygoProwler|Typhoeus|U.S.\ Government\ Printing\ Office|uberbot|ucb-nutch|UCSD-Crawler|UdmSearch|UFAM-crawler-|Ultraseek|UnChaos|unchaos_crawler_|UnisterBot|UniversalSearch|UnwindFetchor|UofTDB_experiment|updated|URI::Fetch|url_gather|URL-Checker|URL\ Control|URLAppendBot|URLBlaze|urlchecker|urlck|UrlDispatcher|urllib|URLSpiderPro|URLy.Warning|USAF\ AFKN\|usasearch|USS-Cosmix|USyd-NLP-Spider|Vacobot|Vacuum|VadixBot|Vagabondo|Validator|Valkyrie|vBSEO|VCI|Vegi\ bot|VerbstarBot|VeriCiteCrawler|Verifactrola|Verity-URL-Gateway|vermut|versus|versus.integis.ch|viasarchivinginformation.html|vikspider|VIP|VIPr|virus-detector|VisBot|Vishal\ For\ CLIA|VisWeb|vlad|vlsearch|VMBot|VocusBot|VoidEYE|VoilaBot|Vortex|voyager|voyager-hc|voyager-partner-deep|VSE|vspider).*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*(W3C_Unicorn|W3C-WebCon|w3m|w3search|wacbot|wastrix|WatzBot|wauuu\ engine|Wavefire|Waypath|Wazzup|Wazzup1.0.4800|wbdbot|web-agent|Web-Sniffer|Web.Image.Collector|Web\ Link\ Validator|Web\ Magnet|webalta|WebaltBot|WebAuto|webbandit|webbot|webbul-bot|WebCapture|webcheck|Webclipping.com|webcollage|WebCopier|WebCopy|WebCorp|webcrawl.net|webcrawler|WebDownloader\ for|Webdup|WebEMailExtrac|WebEMailExtrac.*|WebEnhancer|WebFerret|webfetch|WebFetcher|WebGather|WebGo\ IS|webGobbler|WebImages|Webinator-search2.fasthealth.com|Webinator-WBI|WebIndex|WebIndexer|weblayers|WebLeacher|WeblexBot|WebLinker|webLyzard|WebmasterCoffee|WebmasterWorld|WebmasterWorldForumBot|WebMiner|WebMoose|WeBot|WebPix|WebReaper|WebRipper|WebSauger|Webscan|websearchbench|WebSite|websitemirror|WebSpear|websphinx.test|WebSpider|Webster|Webster.Pro|Webster\ Pro|WebStripper|WebTrafficExpress|WebTrends\ Link\ Analyzer|webvac|webwalk|WebWalker|Webwasher|WebWatch|WebWhacker|WebXM|WebZip|Weddings.info|wenbin|WEPA|WeRelateBot|Whacker|whisper|Widow|WikiaBot|Wikio|wikiwix-bot-|WinHttp.WinHttpRequest|WinHTTP\ Example|WIRE|wired-digital-newsbot|WISEbot|WISENutbot|wish-la|wish-project|wisponbot|WMCAI-robot|wminer|WMSBot|woriobot|worldshop|WorQmada|Wotbox|WPScan|wume_crawler|WWW-Mechanize|www.freeloader.com.|WWW\ Collector|WWWOFFLE|wwwrobot|wwwster|WWWWanderer|wwwxref|Wysigot|X-clawler|Xaldon|Xerka\ MetaBot|XGET|xirq|XmarksFetch|XoviBot|xqrobot|Y!J|Y!TunnelPro|yacy.net|yacybot|yarienavoir.net|Yasaklibot|yBot|YebolBot|yellowJacket|yes|YesupBot|Yeti|YioopBot|YisouSpider|yolinkBot|yoogliFetchAgent|yoono|Yoriwa|YottaCars_Bot|you-dir|Z-Add\ Link|zagrebin|Zao|zedzo.digest|zedzo.validate|zermelo|Zeus|Zeus\ Link\ Scout|zibber-v|zimeno|Zing-BottaBot|ZipppBot|zmeu|ZoomSpider|ZuiBot|ZumBot|Zyborg|Zyte).*$ [NC]
    RewriteRule .* - [F,L]
    </IfModule>
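Before deploying rules like these, it can help to see what they would match. The sketch below mirrors the shape of the conditions (`^.*(a|b|c).*$` with the `[NC]` flag) using Python's `re` module in place of Apache's regex engine. It uses only five alternatives copied from the list, and the function name is hypothetical. The match is a case-insensitive substring match anywhere in the user agent, so short, generic entries such as `news` also catch agents that merely contain that word.

```python
import re

# A few alternatives copied from the rules above; the full list behaves the same way.
SAMPLE_TOKENS = ["Baiduspider", "SemrushBot", "MJ12bot", "Wget", "news"]

# Same shape as the RewriteCond patterns; re.IGNORECASE stands in for [NC].
BLOCK_PATTERN = re.compile(
    r"^.*(" + "|".join(SAMPLE_TOKENS) + r").*$", re.IGNORECASE
)


def would_be_blocked(user_agent):
    """Return the matched text that triggers the rule, or None if the agent passes."""
    match = BLOCK_PATTERN.match(user_agent)
    return match.group(1) if match else None
```

For example, `would_be_blocked("Wget/1.19.4 (linux-gnu)")` returns the matched token, and so does the made-up agent `"ExampleNewsReader/1.0"`, which shows how a broad entry can block more than intended.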

    And to block Amazon AWS bots, if they are an issue for you:

    # START Block IPs
    Order allow,deny
    allow from all
    # Amazon AWS
    deny from 13.32.0.0/15 13.52.0.0/16 13.54.0.0/15 13.56.0.0/16 13.57.0.0/16 13.58.0.0/15 13.112.0.0/14 13.124.0.0/16 13.125.0.0/16 13.126.0.0/15 13.209.0.0/16 13.210.0.0/15 13.228.0.0/15 13.230.0.0/15 13.232.0.0/14 13.236.0.0/14 13.250.0.0/15 18.144.0.0/15 18.194.0.0/15 18.196.0.0/15 18.200.0.0/16 18.216.0.0/14 18.220.0.0/14 18.231.0.0/16 18.253.0.0/16 23.20.0.0/14 27.0.0.0/22 34.192.0.0/10 34.192.0.0/12 34.208.0.0/12 34.224.0.0/12 34.240.0.0/13 34.248.0.0/13 35.153.0.0/16 35.154.0.0/16 35.155.0.0/16 35.156.0.0/14 35.160.0.0/13 35.168.0.0/13 35.176.0.0/15 35.178.0.0/15 35.180.0.0/15 35.182.0.0/15 43.250.192.0/24 43.250.193.0/24 46.51.128.0/18 46.51.192.0/20 46.51.216.0/21 46.51.224.0/19 46.137.0.0/17 46.137.128.0/18 46.137.192.0/19 46.137.224.0/19 50.16.0.0/15 50.18.0.0/16 50.19.0.0/16 50.112.0.0/16 52.0.0.0/15 52.2.0.0/15 52.4.0.0/14 52.8.0.0/16 52.9.0.0/16 52.10.0.0/15 52.12.0.0/15 52.14.0.0/16 52.15.0.0/16 52.16.0.0/15 52.18.0.0/15 52.20.0.0/14 52.24.0.0/14 52.28.0.0/16 52.29.0.0/16 52.30.0.0/15 52.32.0.0/14 52.36.0.0/14 52.40.0.0/14 52.44.0.0/15 52.46.0.0/18 52.46.64.0/20 52.46.80.0/21 52.46.88.0/22 52.46.92.0/22 52.46.96.0/19 52.46.128.0/19 52.46.164.0/23 52.46.172.0/22 52.47.0.0/16 52.48.0.0/14 52.52.0.0/15 52.54.0.0/15 52.56.0.0/16 52.57.0.0/16 52.58.0.0/15 52.60.0.0/16 52.61.0.0/16 52.62.0.0/15 52.64.0.0/17 52.64.128.0/17 52.65.0.0/16 52.66.0.0/16 52.67.0.0/16 52.68.0.0/15 52.70.0.0/15 52.72.0.0/15 52.74.0.0/16 52.75.0.0/16 52.76.0.0/17 52.76.128.0/17 52.77.0.0/16 52.78.0.0/16 52.79.0.0/16 52.80.0.0/16 52.81.0.0/16 52.82.187.0/24 52.82.188.0/22 52.82.192.0/18 52.83.0.0/16 52.84.0.0/15 52.86.0.0/15 52.88.0.0/15 52.90.0.0/15 52.92.0.0/20 52.92.16.0/20 52.92.32.0/22 52.92.39.0/24 52.92.40.0/21 52.92.48.0/22 52.92.52.0/22 52.92.56.0/22 52.92.60.0/22 52.92.64.0/22 52.92.68.0/22 52.92.72.0/22 52.92.76.0/22 52.92.80.0/22 52.92.84.0/22 52.92.88.0/22 52.92.248.0/22 52.92.252.0/22 52.93.0.0/24 52.93.1.0/24 52.93.2.0/24 52.93.3.0/24 52.93.4.0/24 52.93.5.0/24 
52.93.8.0/22 52.93.16.0/24 52.94.0.0/22 52.94.4.0/24 52.94.5.0/24 52.94.6.0/24 52.94.7.0/24 52.94.8.0/24 52.94.9.0/24 52.94.10.0/24 52.94.11.0/24 52.94.12.0/24 52.94.13.0/24 52.94.14.0/24 52.94.15.0/24 52.94.16.0/24 52.94.17.0/24 52.94.20.0/24 52.94.22.0/24 52.94.24.0/23 52.94.26.0/23 52.94.28.0/23 52.94.30.0/23 52.94.32.0/20 52.94.48.0/20 52.94.64.0/22 52.94.68.0/24 52.94.69.0/24 52.94.72.0/22 52.94.76.0/22 52.94.80.0/20 52.94.96.0/20 52.94.112.0/22 52.94.116.0/22 52.94.120.0/22 52.94.124.0/22 52.94.192.0/22 52.94.196.0/24 52.94.197.0/24 52.94.198.0/28 52.94.198.16/28 52.94.198.32/28 52.94.198.48/28 52.94.198.64/28 52.94.198.80/28 52.94.198.96/28 52.94.198.112/28 52.94.198.128/28 52.94.198.144/28 52.94.199.0/24 52.94.200.0/24 52.94.204.0/23 52.94.206.0/23 52.94.208.0/21 52.94.216.0/21 52.94.224.0/20 52.94.240.0/22 52.94.244.0/22 52.94.248.0/28 52.94.248.16/28 52.94.248.32/28 52.94.248.48/28 52.94.248.64/28 52.94.248.80/28 52.94.248.96/28 52.94.248.112/28 52.94.248.128/28 52.94.248.144/28 52.94.248.160/28 52.94.248.176/28 52.94.248.192/28 52.94.248.208/28 52.94.248.224/28 52.94.249.0/28 52.94.249.16/28 52.94.249.32/28 52.94.249.64/28 52.94.249.80/28 52.94.252.0/23 52.94.254.0/23 52.95.0.0/20 52.95.16.0/21 52.95.24.0/22 52.95.28.0/24 52.95.30.0/23 52.95.34.0/24 52.95.35.0/24 52.95.36.0/22 52.95.40.0/24 52.95.48.0/22 52.95.56.0/22 52.95.60.0/24 52.95.61.0/24 52.95.62.0/24 52.95.63.0/24 52.95.64.0/20 52.95.80.0/20 52.95.96.0/22 52.95.100.0/22 52.95.104.0/22 52.95.108.0/23 52.95.110.0/24 52.95.111.0/24 52.95.112.0/20 52.95.128.0/21 52.95.136.0/23 52.95.138.0/24 52.95.142.0/23 52.95.144.0/24 52.95.145.0/24 52.95.146.0/23 52.95.148.0/23 52.95.150.0/24 52.95.154.0/23 52.95.156.0/24 52.95.163.0/24 52.95.164.0/23 52.95.166.0/23 52.95.168.0/24 52.95.192.0/20 52.95.212.0/22 52.95.240.0/24 52.95.241.0/24 52.95.242.0/24 52.95.243.0/24 52.95.244.0/24 52.95.245.0/24 52.95.246.0/24 52.95.247.0/24 52.95.248.0/24 52.95.249.0/24 52.95.250.0/24 52.95.251.0/24 52.95.252.0/24 
52.95.253.0/24 52.95.254.0/24 52.95.255.0/28 
    deny from 52.95.255.16/28 52.95.255.32/28 52.95.255.48/28 52.95.255.64/28 52.95.255.80/28 52.95.255.96/28 52.95.255.112/28 52.95.255.128/28 52.95.255.144/28 52.119.160.0/20 52.119.176.0/21 52.119.184.0/22 52.119.188.0/22 52.119.192.0/22 52.119.196.0/22 52.119.205.0/24 52.119.206.0/23 52.119.208.0/23 52.119.210.0/23 52.119.212.0/23 52.119.214.0/23 52.119.216.0/21 52.119.224.0/21 52.119.232.0/21 52.119.240.0/21 52.192.0.0/15 52.196.0.0/14 52.200.0.0/13 52.208.0.0/13 52.216.0.0/15 52.218.0.0/17 52.218.128.0/17 52.219.0.0/20 52.219.16.0/22 52.219.20.0/22 52.219.24.0/21 52.219.32.0/21 52.219.40.0/22 52.219.44.0/22 52.219.56.0/22 52.219.60.0/23 52.219.62.0/23 52.219.64.0/22 52.219.68.0/22 52.219.72.0/22 52.219.76.0/22 52.219.80.0/20 52.220.0.0/15 52.222.0.0/17 52.222.128.0/17 54.64.0.0/15 54.66.0.0/16 54.67.0.0/16 54.68.0.0/14 54.72.0.0/15 54.74.0.0/15 54.76.0.0/15 54.78.0.0/16 54.79.0.0/16 54.80.0.0/13 54.88.0.0/14 54.92.0.0/17 54.92.128.0/17 54.93.0.0/16 54.94.0.0/16 54.95.0.0/16 54.144.0.0/14 54.148.0.0/15 54.150.0.0/16 54.151.0.0/17 54.151.128.0/17 54.152.0.0/16 54.153.0.0/17 54.153.128.0/17 54.154.0.0/16 54.155.0.0/16 54.156.0.0/14 54.160.0.0/13 54.168.0.0/16 54.169.0.0/16 54.170.0.0/15 54.172.0.0/15 54.174.0.0/15 54.176.0.0/15 54.178.0.0/16 54.179.0.0/16 54.182.0.0/16 54.183.0.0/16 54.184.0.0/13 54.192.0.0/16 54.193.0.0/16 54.194.0.0/15 54.196.0.0/15 54.198.0.0/16 54.199.0.0/16 54.200.0.0/15 54.202.0.0/15 54.204.0.0/15 54.206.0.0/16 54.207.0.0/16 54.208.0.0/15 54.210.0.0/15 54.212.0.0/15 54.214.0.0/16 54.215.0.0/16 54.216.0.0/15 54.218.0.0/16 54.219.0.0/16 54.220.0.0/16 54.221.0.0/16 54.222.0.0/19 54.222.48.0/22 54.222.57.0/24 54.222.58.0/28 54.222.128.0/17 54.223.0.0/16 54.224.0.0/15 54.226.0.0/15 54.228.0.0/16 54.229.0.0/16 54.230.0.0/16 54.231.0.0/17 54.231.128.0/19 54.231.160.0/19 54.231.192.0/20 54.231.224.0/21 54.231.232.0/21 54.231.240.0/22 54.231.244.0/22 54.231.248.0/22 54.231.252.0/24 54.231.253.0/24 54.232.0.0/16 54.233.0.0/18 54.233.64.0/18 
54.233.128.0/17 54.234.0.0/15 54.236.0.0/15 54.238.0.0/16 54.239.0.0/28 54.239.0.16/28 54.239.0.32/28 54.239.0.48/28 54.239.0.64/28 54.239.0.80/28 54.239.0.96/28 54.239.0.112/28 54.239.0.128/28 54.239.0.144/28 54.239.0.160/28 54.239.0.176/28 54.239.0.192/28 54.239.0.208/28 54.239.0.224/28 54.239.0.240/28 54.239.1.0/28 54.239.1.16/28 54.239.2.0/23 54.239.4.0/22 54.239.8.0/21 54.239.16.0/20 54.239.32.0/21 54.239.48.0/22 54.239.52.0/23 54.239.54.0/23 54.239.56.0/21 54.239.96.0/24 54.239.98.0/24 54.239.99.0/24 54.239.100.0/23 54.239.104.0/23 54.239.108.0/22 54.239.116.0/22 54.239.120.0/21 54.239.128.0/18 54.239.192.0/19 54.240.128.0/18 54.240.192.0/22 54.240.196.0/24 54.240.197.0/24 54.240.198.0/24 54.240.199.0/24 54.240.200.0/24 54.240.202.0/24 54.240.203.0/24 54.240.204.0/22 54.240.208.0/22 54.240.212.0/22 54.240.216.0/22 54.240.220.0/22 54.240.225.0/24 54.240.226.0/24 54.240.227.0/24 54.240.228.0/23 54.240.230.0/23 54.240.232.0/22 54.240.244.0/22 54.240.248.0/21 54.241.0.0/16 54.242.0.0/15 54.244.0.0/16 54.245.0.0/16 54.246.0.0/16 54.247.0.0/16 54.248.0.0/15 54.250.0.0/16 54.251.0.0/16 54.252.0.0/16 54.253.0.0/16 54.254.0.0/16 54.255.0.0/16 67.202.0.0/18 72.21.192.0/19 72.44.32.0/19 75.101.128.0/17 79.125.0.0/17 87.238.80.0/21 96.127.0.0/17 103.4.8.0/22 103.4.12.0/22 103.8.172.0/22 103.246.148.0/23 103.246.150.0/23 107.20.0.0/14 122.248.192.0/18 172.96.97.0/24 172.96.98.0/24 174.129.0.0/16 175.41.128.0/18 175.41.192.0/18 176.32.64.0/19 176.32.96.0/21 176.32.104.0/21 176.32.112.0/21 176.32.120.0/22 176.32.125.0/25 176.34.0.0/19 176.34.32.0/19 176.34.64.0/18 176.34.128.0/17 177.71.128.0/17 177.72.240.0/21 178.236.0.0/20 184.72.0.0/18 184.72.64.0/18 184.72.128.0/17 184.73.0.0/16 184.169.128.0/17 185.48.120.0/22 185.143.16.0/24 203.83.220.0/22 204.236.128.0/18 204.236.192.0/18 204.246.160.0/22 204.246.164.0/22 204.246.168.0/22 204.246.174.0/23 204.246.176.0/20 205.251.192.0/19 205.251.224.0/22 205.251.228.0/22 205.251.232.0/22 205.251.236.0/22 205.251.240.0/22 
205.251.244.0/23 205.251.247.0/24 205.251.248.0/24 205.251.249.0/24 205.251.250.0/23 205.251.252.0/23 205.251.254.0/24 207.171.160.0/20 207.171.176.0/20 216.137.32.0/19 216.182.224.0/20 54.183.255.128/26 54.228.16.0/26 54.232.40.64/26 54.241.32.64/26 54.243.31.192/26 54.244.52.192/26 54.245.168.0/26 54.248.220.0/26 54.250.253.192/26 54.251.31.128/26 54.252.79.128/26 54.252.254.192/26 54.255.254.192/26 107.23.255.0/26 176.34.159.192/26 177.71.207.128/26 54.222.20.0/22 205.251.192.0/21 13.54.63.128/26 13.59.250.0/26 13.113.203.0/24 13.124.199.0/24 13.228.69.0/24 18.216.170.128/25 34.195.252.0/24 34.216.51.0/25 34.226.14.0/24 34.232.163.208/29 35.158.136.0/24 35.162.63.192/26 35.167.191.128/26 52.15.127.128/26 52.47.139.0/24 52.52.191.128/26 52.56.127.0/25 52.57.254.0/24 52.66.194.128/26 52.78.247.128/26 52.199.127.192/26 52.212.248.0/26 52.220.191.0/26 54.233.255.128/26 13.55.255.216/29 13.56.32.200/29 13.112.191.184/29 34.228.4.208/28 34.250.63.248/29 35.157.127.248/29 35.176.92.32/29 35.182.14.48/29 52.15.247.208/29 52.43.76.88/29 52.221.221.128/29
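To check whether a particular visitor address from the access log falls inside a deny list like this, Python's standard `ipaddress` module can do the CIDR arithmetic. This is a small sketch using only a handful of ranges copied from the list above, and the function names are hypothetical. It also shows that some entries overlap: `34.192.0.0/10` already contains `34.192.0.0/12` and `34.208.0.0/12`.

```python
import ipaddress

# A handful of ranges copied from the deny list above; add the rest as needed.
DENIED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "13.32.0.0/15",
        "23.20.0.0/14",
        "34.192.0.0/10",
        "52.0.0.0/15",
        "54.64.0.0/15",
    )
]


def is_denied(address):
    """Return True if the address falls inside any of the denied ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DENIED_RANGES)


def collapse(cidrs):
    """Merge overlapping or adjacent ranges into the shortest equivalent list."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [str(network) for network in ipaddress.collapse_addresses(networks)]
```

Running `collapse` over the whole list would give a shorter set of ranges that covers the same addresses.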
    

    Thanks. That’s some list!

    I’m successfully blocking Amazon AWS using Wordfence. And I think I’ll make more use of its blocking facility.

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi,
    I added two more questions to our FAQ page, and topics to our On Line Guide, about this.
    Faq Page
    On Line Guide
    Cheers,
    Bill

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi,

    We just released version 5.28.
    Now, if the user agent contains the nickname (or the IP matches), our system shows a 403 Forbidden screen. In statistics tools such as Webalizer or Visitor Metrics, you will then see status 403 (Forbidden) and 0 bytes.

    Cheers,
    Bill
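With this change, a blocked request should be recognisable in the access log by its status code. The following sketch labels a log line as forbidden, blank, or served. It assumes the status and size fields directly follow the quoted request (as in the Apache common and combined formats), and that the earlier blank page was sent with a 2xx status, which the thread does not state. The function names are hypothetical.

```python
import re

# Status and size directly after the quoted request line (assumed log layout).
STATUS_AND_SIZE = re.compile(r'" (\d{3}) (\d+|-)(?:\s|$)')


def classify_response(status, body_bytes):
    """Label how a request appears to have been handled.

    "forbidden": status 403, the behaviour described for version 5.28
    "blank":     a 2xx status with an empty body (assumed earlier behaviour)
    "served":    anything else, meaning the client received content
    """
    if status == 403:
        return "forbidden"
    if 200 <= status < 300 and body_bytes == 0:
        return "blank"
    return "served"


def classify_log_line(line):
    """Classify one access-log line, or return None if it cannot be parsed."""
    match = STATUS_AND_SIZE.search(line)
    if not match:
        return None
    size = 0 if match.group(2) == "-" else int(match.group(2))
    return classify_response(int(match.group(1)), size)
```

Lines labelled "served" for a bot that is enabled in the Bad Bots Table are the ones worth investigating further.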

    Interesting! I’ve also added quite a lot to robots.txt and added filters to Wordfence.

    I know this isn’t really relevant to your plugin but just for info –

    My main problem is still Bingbot. On our non-commercial running club site, Google’s bots are using about 1 GB a month, and Bingbot a massive 8 GB. Most of our referral traffic comes from Google, and a tiny amount from Bing. I don’t want to bar it completely, just make it use a reasonable amount.

    It appears to have 3 main IP addresses. I’ve more or less successfully throttled it (120 secs), which it seems to respond to by crawling in rapid batches with longer gaps. I’ve also added a lot of Disallow rules.

    The battle goes on!
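The post does not say how the 120-second throttle was applied. Assuming it was expressed as a `Crawl-delay` directive in robots.txt (a Wordfence rate limit would be another possibility), the rules can be checked locally with Python's standard `urllib.robotparser` before relying on them. The robots.txt content and the `/tag/` path below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the Disallow path is only an example.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 120
Disallow: /tag/
"""


def load_rules(text):
    """Parse robots.txt text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
delay = rules.crawl_delay("bingbot")
tag_allowed = rules.can_fetch("bingbot", "https://example.com/tag/results/")
page_allowed = rules.can_fetch("bingbot", "https://example.com/about/")
```

Here `delay` is the crawl delay that a compliant crawler identifying as bingbot would read, and the two `can_fetch` results show which paths the Disallow rules cover. Whether a given crawler honours these directives is up to that crawler.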

    Plugin Author Bill Minozzi

    (@sminozzi)

    Hi Digbymaass

    Wordfence is not our product, so please post in their support forum.
    As for Bing using too much bandwidth, I suggest you ask for support on the Bing site.
    This support forum is only for the Stop Bad Bots plugin.

    Cheers,
    Bill

    • This reply was modified 5 years, 10 months ago by Bill Minozzi.
  • The topic ‘Some Bots on Table Not Actually Being Blocked’ is closed to new replies.