• I have a site that outputs about 400 files when done, and I’m excluding the uploads folder (Media Library contents) because I have about 10,000 images that I handle separately. The generation used to take about 3 minutes in a VMware + PHP 7.4 + MariaDB + OpenLiteSpeed setup. Not fast, but I was relatively happy with it.

    Now I’m on Windows 11 with WSL2 Ubuntu 20.04 and Docker (not Docker Desktop). I’ve tried the official WP image (Apache and MySQL) with PHP 7.4 and 8.1, the PHP-FPM version behind Nginx, MariaDB, with and without WP-Cron; no matter what combination I try, the same job takes 10-20 minutes. Is there a hidden throttling/delay between requests to prevent abusing the server? Looking at the CPU usage, it’s just chilling below 10%, as if the script isn’t really trying to hammer the server (though it could take it). Loading a single page locally takes under 200 ms (it’s a fast hand-coded site even without a static copy, with barely any plugins), so fetching a few hundred pages, rejecting any links to images, and saving the files should still take less than two minutes, especially if multithreaded (I understand this plugin doesn’t do that, but regular static site generators that work from .md files probably do). Looking at the access log, it requests a page only every second or so. Where is the bottleneck? I’ll try OpenLiteSpeed in Docker as a last resort…

    Looking at the debug log, it’s full of lines containing my exclude keyword and determining that it’s “excludable”. Why are these even added to the queue/db?

    If I change class-ss-fetch-urls-task.php to

    
    foreach ( $urls as $url ) {
    	// Skip any URL containing the exclude keyword before it is queued.
    	if ( stripos( $url, 'myexcludeword' ) !== false ) {
    		continue;
    	}
    	$this->set_url_found_on( $static_page, $url );
    }
    

    and basically apply my exclude filter before the URL is stored for later processing in the database, it at least no longer says it’s trying to fetch 10k+ pages/files, but it’s still slow to save those 400 pages… it only shaved off a few minutes.
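
    For reference, here’s a rough, untested sketch of the same loop with a list of keywords instead of a single hard-coded one (the $exclude_keywords array is just an example for illustration; in practice it would come from the plugin’s exclude settings):

    
    $exclude_keywords = array( 'myexcludeword', 'uploads' ); // example values only
    
    foreach ( $urls as $url ) {
    	$skip = false;
    	foreach ( $exclude_keywords as $keyword ) {
    		// Same case-insensitive substring check as stripos() above.
    		if ( stripos( $url, $keyword ) !== false ) {
    			$skip = true;
    			break;
    		}
    	}
    	if ( $skip ) {
    		continue; // never queue excluded URLs in the first place
    	}
    	$this->set_url_found_on( $static_page, $url );
    }
    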

  • What do you think is slow?

    For me, the time to create an archive (zip) is OK.
    However, the time to commit the files to GitHub or upload them to the CDN is terribly slow.

    Thread Starter Firsh (@firsh)

    I think it’s the crawling or site/link discovery portion. During my experiments I’ve found that my setup is slower in WSL than in VMware, which affects page load times (the same setup is 2-3x slower in WSL, and no one knows why). That slowdown shows up in this plugin too.

    In the end it comes down to not creating connections more aggressively, which could only be done with multithreading. I’ve never coded such a thing in PHP, so I’m not sure how it would work, but if it were a configurable setting, people could experiment with something like 2-3 parallel page downloads at least (see the rough sketch after this reply).

    I save the files to a directory instead of a ZIP, since I have a postprocessing step that does what caching plugins used to do (minify and combine the CSS and JS, do a global search and replace, with Grunt) before committing and pushing to GitHub. The large assets (images and video) are NOT handled by this plugin for me; I upload them by syncing separately, and it’s fine that they aren’t version controlled (deletions are delayed). Netlify automatically builds a preview from the commit pushed to GitHub, and I accept the changes when I make it live.
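
    Just to illustrate the parallel download idea, here’s a rough, untested sketch using PHP’s curl_multi functions (the URL list, the save directory, and the fetch_in_batches() name are made up for this example; it’s not how the plugin works internally):

    
    <?php
    // Fetch a list of URLs in small parallel batches and save each response to disk.
    function fetch_in_batches( array $urls, string $save_dir, int $concurrency = 3 ) {
    	foreach ( array_chunk( $urls, $concurrency ) as $batch ) {
    		$mh      = curl_multi_init();
    		$handles = array();
    
    		foreach ( $batch as $url ) {
    			$ch = curl_init( $url );
    			curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
    			curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
    			curl_multi_add_handle( $mh, $ch );
    			$handles[ $url ] = $ch;
    		}
    
    		// Run all transfers in this batch to completion.
    		do {
    			$status = curl_multi_exec( $mh, $running );
    			if ( $running ) {
    				curl_multi_select( $mh );
    			}
    		} while ( $running && $status === CURLM_OK );
    
    		foreach ( $handles as $url => $ch ) {
    			file_put_contents( $save_dir . '/' . md5( $url ) . '.html', curl_multi_getcontent( $ch ) );
    			curl_multi_remove_handle( $mh, $ch );
    			curl_close( $ch );
    		}
    
    		curl_multi_close( $mh );
    	}
    }
    
    // Example: download two pages with up to 3 parallel connections.
    fetch_in_batches( array( 'http://localhost/', 'http://localhost/about/' ), '/tmp/static-test', 3 );
    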

  • The topic ‘How to make it faster?’ is closed to new replies.