Thank you for your feedback. Believe me when I say that I spent much of my time on this project working out the best way to complete an efficient scan of all the files on a site with the least impact on the site while the scan is running. Most of the servers I tested this plugin on were also small and under-powered, just like the VPS from DO that you are using. Also understand that you cannot directly control how much processing power a PHP script uses through PHP code. Most of the load balancing and task switching is handled at the server level and depends on how PHP and Apache (or nginx) are configured on your server.
That said, the Complete Scan is a hybrid of PHP and JavaScript that uses multiple separate AJAX calls in a linear order to spread the processing over a reasonable amount of time and lessen the scan's overall impact on the server's resources. To understand this better, compare this technique with the "Quick Scan" options, which were the first attempt at a scan process and used only a single call to the PHP script. In that original version the Quick Scan could cover a great many files in a very short amount of time, but it would consume all of the available system resources until the job was done. In many cases the server simply didn't have, or didn't allocate, enough memory for the whole job, or there were so many files that the script ran too long and the server killed it, timing out the whole process. This led me to break the scan into smaller sets of files, such as the separate Quick Scans for Plugins, Themes, and Core Files, and to limit the scan depth to prevent drilling down into directory trees that would take too long to index and scan. While this was more successful, it still was not 100% thorough, and it would still fail sometimes on smaller servers or larger filesystems.
This brings me to the current hybrid scan that I call the Complete Scan. In this process the directories are indexed first and listed in a linear order, then scanned individually, only one at a time. While this inevitably increases the overall time it takes to scan the whole site, it also allows other normal traffic a fair amount of time to be processed and ensures that your visitors will continue to receive their page requests while the scan is running. Even if your server has only one processor, Apache is designed to answer a reasonable number of requests at the same time and will call on PHP to process as many simultaneous jobs as possible within the limited amount of memory your server has allocated to them, and during the Complete Scan you can rest assured that only one of those processes will be my scan job. So you may notice that your one processor is completely maxed out during the scan, and it may be true that this intensive scan job is taking up 100% of that processor's time if no other pages are being requested at that moment. But if three other pages are being requested at the same time, you will find that my process consumes only its appropriate share of processor time as the other processes get their fair share. Additionally, the next directory in the scan list is not started until the prior directory's scan job has finished, which gives the server a short break and allows any other pages in the request queue to be processed before the next scan job begins.
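To illustrate the idea, here is a minimal sketch in plain JavaScript of that one-at-a-time pattern. This is not the plugin's actual code: requestScan stands in for a single AJAX call to the PHP handler (simulated here so the sketch is self-contained), and the directory names are just examples.

```javascript
// Stand-in for one AJAX call to the PHP scan handler. In the real
// plugin this would be a request to the server that scans exactly
// one directory and returns its results.
async function requestScan(dir) {
  return { dir, suspicious: [] };
}

// Step 1: the directories are indexed first into a linear list.
const directoryQueue = ['wp-admin', 'wp-includes', 'wp-content/plugins'];

// Step 2: scan them strictly one at a time. Each iteration awaits the
// previous request before issuing the next, so the server never runs
// two scan jobs at once and gets a short break between directories.
async function runCompleteScan(queue) {
  const order = [];
  for (const dir of queue) {
    const result = await requestScan(dir); // next job starts only after this one finishes
    order.push(result.dir);
  }
  return order;
}
```

Calling runCompleteScan(directoryQueue) resolves with the directories in the same order they were indexed, since no scan request is sent until the prior one has completed.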
I believe that this is about the best I can do, given the enormity of the task at hand and the vast variation of server configurations out there. I hope that this explanation of the inner workings of my scan process has helped you. Please let me know if you have anything further to add or any ideas that might help to improve the process further.