• I am running/managing an Ubuntu server with 96 GB of RAM and 16 cores. Whenever the weekly newsletter is sent out, the website is hit with a large number of requests, mostly from bots. To ensure the workload is managed appropriately, is it acceptable to throttle WordPress’ PHP so it isn’t competing with MySQL, e.g. by calling proc_nice() within PHP to set a slightly lower priority than normal, so that MySQL processes all the SQL quickly and PHP consumes the spare server resources? Or is it best to just let Linux work out who gets what?

    Scenario
    A newsletter is sent out. Bots follow the links just as humans do. Hundreds of page requests arrive per second over 3 minutes, and the server picks up the load and runs flat out for approximately 3.5 minutes.

    What I was considering is this: if PHP’s priority is lowered a little, processing will queue in PHP-FPM and Apache and flow into the server gradually, rather than all requests being dumped at once, which can max out the MySQL connections and cause dropped pages or slow page times. PHP would then feed work to MySQL as resources become available, so requests flow through the server more smoothly; something like the sketch below. Thoughts? Is this a practical idea?
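
    For concreteness, here is the sort of thing I have in mind; the mu-plugin path and the +5 value are purely illustrative, not a tested recommendation:

        <?php
        /**
         * Hypothetical mu-plugin: wp-content/mu-plugins/lower-php-priority.php
         * A minimal sketch of the idea, not a tested recommendation.
         *
         * proc_nice() shifts the nice value of the current process by the
         * given amount; a positive shift lowers CPU scheduling priority,
         * leaving MySQL (still at nice 0) to win any contention.
         */
        if ( function_exists( 'proc_nice' ) ) {
            // +5 is arbitrary. Caveat: PHP-FPM reuses worker processes and
            // the nice value is process state, so per-request calls compound;
            // the pool setting `process.priority` applies the same change
            // once, at worker spawn.
            @proc_nice( 5 );
        }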

    Obviously, the alternative is to just ensure there are sufficient MySQL connections and let the server manage the competing demands. At the workload peak, Linux reports a 1-minute load average of 51, i.e. enough runnable work to keep 51 cores busy on a 16-core box, hence the question.
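
    If it helps, that alternative is essentially a one-line change; the value below is illustrative only and should be sized against each connection’s memory cost:

        # /etc/mysql/mysql.conf.d/mysqld.cnf -- values illustrative only
        [mysqld]
        # MySQL 8.0 defaults to 151. Each connection has a memory cost,
        # so size this against available RAM rather than raising it until
        # the "Too many connections" errors stop.
        max_connections = 300
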
    MySQL v8.0.34, PHP v8.2.11
    Thank you. Any thoughts or advice welcomed.

  • Hello reecejames, & welcome.

    It’s late. I’ll likely ramble. Please just take what you think is worthwhile & forget I ever said the rest.

    This is always a problem, no matter the server size, & you’ve got a honker! Just a few things I’m wondering.

    1. Do you use any caching on your site? That might prove very helpful.
    2. Is there a reason you’re not using Nginx? It tends to be more efficient under high loads *if configured correctly*. I personally like it. A lot!
    3. Have you tried using a robots.txt file, both to slow the better-behaved bots down (via Crawl-delay) and to deny pages you don’t want them crawling? (See the sketch after this list.)
    4. Are you using a firewall & blocklist combination to deny bad bots, as per abuseipdb.com for example?
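
    For instance, a minimal robots.txt along these lines; the paths & the delay value are purely illustrative:

        # Illustrative robots.txt. Crawl-delay is non-standard: some
        # crawlers honor it, but Googlebot ignores it.
        User-agent: *
        Crawl-delay: 10
        Disallow: /wp-admin/
        Allow: /wp-admin/admin-ajax.php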

    One thing I always like to do is study my logs in order to see where the bad traffic is coming from. Blocking IPs is somewhat like playing whack-a-mole, because they can be spoofed, etc., but it is a start, especially w/IPs that are chronically abusive.

    Have you implemented any sort of captcha protection on your newsletter? Please make sure whatever you implement in that regard is accessible to all (many captchas are not); Google’s reCAPTCHA is, & it might slow down the bad bots.

    A CDN might also help.

    I think that’s at least a start. Please let us know if you have additional questions.

    Thread Starter reecejames (@reecejames)

    Hi abletec
    Thank you for your reply.
    Yes, everything is cached. In fact, I have tuned the server so it performs very little read I/O, almost none from MySQL, and all images etc. are cached. One of my earlier speed goals was to ensure the site holds all its content in memory, hence the honker!
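
    (For anyone reading along: a quick way to confirm reads really are coming from memory is MySQL’s buffer-pool counters, where the hit rate should sit near 100%.)

        -- If content is truly served from memory, the disk-read counter
        -- (Innodb_buffer_pool_reads) stays tiny next to the logical-read
        -- counter (Innodb_buffer_pool_read_requests).
        SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';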

    Yes, we use cPanel, and until recently cPanel didn’t really support NGINX. The other reason is that the business owner doesn’t want to use any software written in Russia, and as you’ll know, NGINX falls into that category. Regardless, because I have plenty of RAM there is probably little difference between NGINX and Apache here; I’m well aware of NGINX’s better scalability from a RAM perspective.

    Yes, there is a robots.txt file; however, the bots that visit are ones we want, others are excluded, and in any case many bots ignore the robots.txt file.

    Yes, bad bots are blocked, but it isn’t bad bots that are generating this workload.

    Because of the way the site works, and because the server generally has low usage, a CDN isn’t worth the additional hassle and the site design changes needed to meet a CDN provider’s requirements.

    I think, and I appreciate the response, that you’ve come at the problem as if it were about preventing bad bots. It’s not a bad-bots issue; these are bots we want trawling the site. The problem is the sheer number of pages requested at once, plus the users hitting the site at the same time. At its heart this is simply a “how best to manage the additional workload” question, hence wondering whether, if I slow the PHP processing down a little and allow MySQL to take what it wants whenever it wants it, new requests will queue in PHP-FPM/Apache as needed and the workload will flow through better (roughly the pool settings sketched at the end of this post). Is this practical?
    Thoughts?
    Thank you!!!
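
    For completeness, here is roughly how the “queue in PHP-FPM” half of the idea maps onto ordinary pool settings; the values are illustrative guesses for this 16-core box, not tested recommendations:

        ; /etc/php/8.2/fpm/pool.d/www.conf -- values illustrative only
        ; Cap concurrent PHP workers so bursts queue in the socket backlog
        ; instead of all reaching MySQL at once.
        pm = static
        pm.max_children = 32
        ; Overflow requests wait here (capped by net.core.somaxconn).
        listen.backlog = 1024
        ; A positive nice value lowers the workers’ CPU priority; this is
        ; the pool-level equivalent of calling proc_nice() per request.
        process.priority = 5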

  • The topic ‘Bots Causing heavy load on server – can this be throttled’ is closed to new replies.