PHP-FPM causing server to crash with massive CPU spikes on very small WordPress websites

AdamP · August 8, 2022, 12:54pm

I’ve been having similar problems on Ubuntu 20.04. There weren’t any major changes, but the server started randomly freezing up where I’d have to hard reset it. This would happen sometimes only once a day, but usually as many as 3 times per day. I was using PHP-FPM for all virtual servers after disabling mod_php. I have now switched all virtual servers to FCGId and it doesn’t appear to be freezing anymore.

Edit: Scratch that - PHP doesn’t have write permissions at all with FCGId or CGI Wrapper modes. I had to change them all back but used the settings Dibs suggested above.

Dibs · August 8, 2022, 9:26pm

@AdamP - Welcome to the forum.

Do monitor your PHP error logs for messages of the sort

WARNING: [pool something] server reached pm.max_children setting (n), consider raising it.

where (n) is the value you’ve set. That usually indicates the setting needs adjusting upwards.
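A minimal sketch for checking this across all pools at once. It assumes the Ubuntu/Debian default log location for PHP 7.4 (/var/log/php7.4-fpm.log); the function name is made up for illustration, and the real path is whatever error_log is set to in php-fpm.conf.

```shell
# Count "reached pm.max_children" warnings per pool in a PHP-FPM log.
# ASSUMPTION: the default path below is the Ubuntu/Debian one for PHP 7.4;
# pass another file as $1 if your error_log setting differs.
count_max_children_warnings() {
  log_file="${1:-/var/log/php7.4-fpm.log}"
  grep 'server reached pm.max_children setting' "$log_file" \
    | sed -E 's/.*\[pool ([^]]+)\].*/\1/' \
    | sort | uniq -c | sort -rn
}

# Usage on a live system:
#   count_max_children_warnings
#   count_max_children_warnings /var/log/php7.4-fpm.log.1
```

Each output line is a count followed by a pool name, busiest pool first. No output means the limit was never hit in that log file.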

HIH

Dibs

dougtracey · August 10, 2022, 1:04pm

Hi Dibs,

So PHP Limits are set as follows by default:
pm = dynamic
pm.max_children = 20
pm.start_servers = 1
pm.min_spare_servers = 1
pm.max_spare_servers = 5
php_admin_value[memory_limit] = 256M
php_admin_value[max_execution_time] = 180
php_admin_value[max_input_time] = 120
php_admin_value[post_max_size] = 64M
php_admin_value[upload_max_filesize] = 64M

Would you say any setting here could allow any one server to use 70-90% of the AMD Ryzen 7 PRO 3700 8-Core Processor (16 threads) at one time? Dropping the memory to 128MB caused the sites to crash.

Dibs · August 10, 2022, 1:53pm

@dougtracey - the last 2 settings to my mind are neither here nor there really. From looking back at your earlier posts\pictures - the issues seem primarily related to the “www” and “snagadmin” pools.

For the default server\pool - the one that is running under “www” - do you have a website there?

Also - I’m assuming www\snagadmin have WordPress sites (ignore if they don’t), my advice would be to turn off all the plugins - watch performance.

I usually run top in an SSH (or shell) connection and hold down SHIFT and hit the “</,” key 3 times to change the column the results are ordered by (could be the “>/.” key tho as I can’t quite remember).
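As a non-interactive alternative, here is a rough sketch that adds up CPU per pool user. It assumes the procps version of ps found on Linux and that the worker processes are named php-fpm followed by a version (for example php-fpm7.4); the function name is hypothetical. Note that ps reports CPU averaged over each process's lifetime, so it will look calmer than top during a short spike.

```shell
# Sum %CPU of php-fpm processes per owning user.
# Input on stdin: lines of "USER %CPU COMMAND", as produced by the ps call below.
# The php-fpm master process runs as root and is counted under root.
sum_fpm_cpu_by_user() {
  awk '$3 ~ /^php-fpm/ { cpu[$1] += $2 }
       END { for (u in cpu) printf "%s %.1f\n", u, cpu[u] }' \
    | sort -k2,2nr
}

# Usage on a live system (user:32 stops ps truncating long user names):
#   ps -eo user:32,pcpu,comm --no-headers | sum_fpm_cpu_by_user
```

Because each pool runs under its own user, the top line points at the pool (and so the site) doing the most work.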

Assuming you aren’t getting spikes in CPU, enable the plugins 1 at a time and see what happens to the CPU. My suspicion is you may have a plugin that is misbehaving or badly written.

As an aside - I’d suggest trying a less than current version of PHP, say PHP 7.4. It is possible that 8.0 might have a bug etc. You can obviously have both versions at the same time and choose which one that Virtual Server sees\uses. All mine (except 1) are running on 7.4. I’d be tempted to try this before “testing” any plugins.

HIH

Dibs

dougtracey · August 10, 2022, 3:55pm

Thanks @Dibs, you are right to assume that I’m hosting WordPress. Having had a few complaints with all the messing around, I’ve moved the snagging site to its own TSO VPS running 6vCPUs and 24GB of memory, running Alma 8 OS.

It allows me to isolate what is causing the issues and I’ve even installed WP Crontrol to see what cron jobs are running. I’ve delayed a couple of Hummingbird crons for preloading cache that was set for every minute, but it looks like the spikes are happening on every page load, which is weird. The site didn’t have any traffic. Even disabling Hummingbird the issues were still there.

While running Hummingbird Asset Optimisation: [screenshot not preserved]

The big problem on my main server is that a lot of the servers are doing this now, so it’s very unstable at the moment…but at least it’s running pretty fast!

Dibs · August 10, 2022, 4:27pm

@dougtracey - I’ve just browsed that site and it seems far from complex. I run at least one site with more pages than that, a ton of images and it barely causes a 2vCPU, 1GB RAM VPS to even stir, let alone break a sweat. And I run a minimal set of plugins on that and all the sites.

My advice:

Get a clean o\s install (perhaps try Debian or Ubuntu - 1 version behind the latest), make sure a less than latest version of PHP is installed (I know 7.4 is rock solid).
Lock it down etc.
Install Virtualmin
Create your Virtual Server (make it a “test” subdomain of smaggingcompany.com and either update the DNS or just add a line in your hosts file if using Windows as the client - this way you won’t affect production)
Install wordpress - load\copy your site in (minus all the plugins), just the template & content.

I would expect it to barely stir if it’s sized like the prod server or close. Then start adding your necessary plugins one at a time and see what happens. When I say necessary - ones that really are, don’t get carried away with plugins because you “think” they are necessary.

At some point, you should see it go nuts on the CPU, if one of the plugins is at fault. If you see no issues at all - then that would mean the o\s has a “feature”.
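The one-plugin-at-a-time test can be scripted. The sketch below assumes WP-CLI (the wp command) is installed and run from the WordPress directory as the site's own user; the thread does not say which tool anyone used, and the function name is invented here. The load average it prints is only a crude signal, so keep top open in another window as well.

```shell
# Re-enable plugins one at a time, printing the 1-minute load average after each.
# ASSUMPTION: WP-CLI ("wp") is installed; this is not something the thread confirms.
# Reads plugin names (one per line) from stdin; $1 = seconds to wait per plugin.
enable_plugins_one_by_one() {
  pause="${1:-120}"
  while IFS= read -r plugin; do
    [ -n "$plugin" ] || continue
    # </dev/null keeps wp from consuming the plugin list on stdin
    wp plugin activate "$plugin" </dev/null >/dev/null || return 1
    sleep "$pause"
    load=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo n/a)
    printf '%s\t%s\n' "$plugin" "$load"
  done
}

# Usage, on the test copy of the site rather than production:
#   wp plugin list --status=active --field=name > plugins.txt
#   wp plugin deactivate --all
#   enable_plugins_one_by_one 120 < plugins.txt
```

Browse a few pages during each wait so the newly enabled plugin actually gets exercised.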

HIH

Dibs

p.s. just spin up a VPS at somewhere like Linode and test things for something like 60 days at little to no cost. I can always give you a referral code for 60 days & $100.

AdamP · August 10, 2022, 7:20pm

Thank you for the tips and the warm welcome. I’ve been monitoring all of my sites and have not seen any such messages. On top of that, my server has now been up for almost 2 days, which is a massive improvement over how it was acting for a couple of weeks straight!

I do also sometimes see quite a bit of CPU usage by the default www pool, but I don’t know what is using that. Could that be part of the problem?

@dougtracey I was having similar problems with the settings you’re using and they all went away when I lowered the max_children from 20 to 5. My suggestion would be to start low and work your way up as needed.
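Some arithmetic on why a lower max_children helps, as a sketch only. Each PHP-FPM worker handles one request at a time and can keep one CPU core busy, so a pool with max_children = 20 can in principle occupy more cores than a 16-thread CPU has. Memory has a similar ceiling, worked out below from the settings quoted earlier in the thread (the function name is hypothetical).

```shell
# Upper bound on PHP memory one pool could request: max_children * memory_limit.
# This is a ceiling, not typical usage: memory_limit caps a single request,
# and most workers use far less.
pool_memory_ceiling_mb() {
  max_children="$1"
  memory_limit_mb="$2"
  echo $(( max_children * memory_limit_mb ))
}

pool_memory_ceiling_mb 20 256   # prints 5120 (settings quoted earlier in the thread)
pool_memory_ceiling_mb 5 256    # prints 1280 (after lowering max_children to 5)
```

These figures are per pool, so with one pool per virtual server the ceilings add up across every site on the box.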

Dibs · August 10, 2022, 7:37pm

@AdamP - what Virtual Server is using the www pool? You might need to look thru all the pool conf files - something usually gives it away. There might be an easier way - but I haven’t tried to look for one yet.

HIH

Dibs

AdamP · August 10, 2022, 7:58pm

In the directory (/etc/php/7.4/fpm/pool.d/) I see all of the config files. All are named with strings of seemingly random numbers. There is one file for each site except one named www.conf. It doesn’t seem to point to any specific website/directory and it is the only file that references www or the user www-data.

I don’t want to continue hijacking this thread, but I’m just not sure what that pool is for or why it would have high CPU usage occasionally.

Dibs · August 10, 2022, 9:25pm

@AdamP - I’d like to think no one is going to get bent out of shape and ask you to start your own thread on the same subject. So it’s all good.

I’ve just looked at mine and there is a www one too, and running a

top -u www-data

command where www-data is the user\group in the pool - the values are single digit for SHR and 0.0% for CPU. I don’t have a website on the default Virtual Server which I think it might be for but then again I think the www-data pool\user might be for Apache itself. I might have to look into it & read up on what that pool\group\etc is for.

HIH

Dibs

dougtracey · August 10, 2022, 9:40pm

Thanks guys, it seems to be happening with poor execution of WordPress processes using massive amounts of resources. This is one of my sites I’m running SG Migrator to see if it works any better on Siteground as the client already have an account. As you can see it’s totally wiping out my server resources running this process!

Dibs · August 10, 2022, 9:48pm

@dougtracey - that’s insane.

Dibs · August 10, 2022, 9:58pm

@AdamP - I’m just looking thru the docs and it would appear that the www.conf pool is the default pool created and would be used by all processes\sites. Until you change the setup to use a pool per Virtual Server.

I’ve gone thru all my pool files (the numeric named ones) and made a note of the file name, user\group and the socket\port nbr. All the virtual servers have their own pool.
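That note-taking step can be done in one pass. A minimal sketch, assuming the pool files use the plain "key = value" layout with user and listen lines starting at the left margin; the function name is invented, and the default directory is the PHP 7.4 one mentioned earlier in the thread.

```shell
# Print one line per pool file: file name, user, listen socket/port (tab separated).
# ASSUMPTION: "user = ..." and "listen = ..." start at the left margin;
# lines commented out with ";" are ignored.
list_pools() {
  pool_dir="${1:-/etc/php/7.4/fpm/pool.d}"
  for f in "$pool_dir"/*.conf; do
    [ -e "$f" ] || continue
    awk -F'[[:space:]]*=[[:space:]]*' -v file="$(basename "$f")" '
      /^[[:space:]]*;/ { next }
      $1 == "user"   { user = $2 }
      $1 == "listen" { listen = $2 }
      END { printf "%s\t%s\t%s\n", file, user, listen }' "$f"
  done
}

# Usage:
#   list_pools /etc/php/7.4/fpm/pool.d
```

The listen column is the value to look for in the web server's site files when checking which site talks to which pool.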

In the morning, I’m just going to double check in sites-available (I’m using Apache) to make sure all the sites are using their own pools, and then perhaps delete the www.conf pool file (making a copy) and then restart php-fpm and see what happens. It might be that it is the “template” file for new virtual servers, but that’s easy enough to test - just change the number to be slightly different to what is in use & create a new virtual server and then check its pool conf file.

If you’ve got CPU usage for the www pool - then I wonder if one of your sites in sites-available is using it.
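One way to check that, sketched under the assumption that Apache hands PHP to FPM through mod_proxy_fcgi, so each site file contains a "proxy:" handler or an "fcgi://" address. The function name is hypothetical and the default directory is the usual Debian/Ubuntu one.

```shell
# Show every line in the Apache site files that points at a FastCGI backend.
# ASSUMPTION: sites reach PHP-FPM via mod_proxy_fcgi ("proxy:" or "fcgi://").
list_site_backends() {
  sites_dir="${1:-/etc/apache2/sites-available}"
  grep -rHnE 'proxy:|fcgi://' "$sites_dir"
}

# Usage:
#   list_site_backends
#   list_site_backends /etc/apache2/sites-enabled
```

Any site whose socket or port matches the listen value in www.conf, rather than one of the numeric pool files, is being served by the www pool.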

Will post up tomorrow how I get on - i.e. if it all goes bang or is OK. LOL

HIH

Dibs

dougtracey · August 10, 2022, 11:05pm

Thanks @Dibs, just seen another huge spike when Virtualmin is doing an all server backup to AWS with GZip causing the spike!!..guess this is why I run it overnight!

Dibs · August 11, 2022, 10:50am

@AdamP & @dougtracey - I was reading up on Google and I did find mentions that the www pool conf file is the default file when php-fpm is installed and if you have correctly configured individual pools - the www.conf file should be renamed to something like www.conf.template as nothing should be using it.

This stops a pool being started due to the include directive at the end of the /etc/php/7.4/fpm/php-fpm.conf file.

I’ll be experimenting on a system later, after the caffeine has kicked in.

Cheers

Dibs

Dibs · August 11, 2022, 5:43pm

@AdamP & @dougtracey - folks, quick update on the www.conf pool.

I checked all the “files” in sites-available, cross referenced them with the numeric pool conf files and things matched. So just renamed www.conf to www.conf.template (for both php7.4 & php7.2 - I have both) and restarted both php-fpm services and no more www-data user owned php-fpm process showing in top.

I was actually running top -u www-data at the time in a shell window and watched them disappear when I restarted the php-fpm services. www-data being the user\group in the www.conf pool files.
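The rename described above, written as a small guarded sketch. The function name is made up; the directory and the php7.4-fpm service name are the Ubuntu/Debian ones and are assumptions for any other system. The guard is there because PHP-FPM refuses to start if no pool is defined at all.

```shell
# Rename the stock www.conf so the pool.d/*.conf include no longer picks it up.
# Refuses to act if no other *.conf pool would remain, because PHP-FPM
# will not start without at least one pool.
disable_default_pool() {
  pool_dir="${1:-/etc/php/7.4/fpm/pool.d}"
  [ -f "$pool_dir/www.conf" ] || { echo "no www.conf in $pool_dir" >&2; return 1; }
  others=$(find "$pool_dir" -maxdepth 1 -name '*.conf' ! -name 'www.conf' | wc -l | tr -d ' ')
  [ "$others" -gt 0 ] || { echo "www.conf is the only pool; leaving it" >&2; return 1; }
  mv "$pool_dir/www.conf" "$pool_dir/www.conf.template"
}

# On the live server (as root), followed by a service restart:
#   disable_default_pool /etc/php/7.4/fpm/pool.d && systemctl restart php7.4-fpm
```

Only do this after confirming that no site still points at the www pool's socket, otherwise that site's PHP will stop working after the restart.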

HIH

Dibs

AdamP · August 12, 2022, 2:13pm

This is wonderful information! You’re a stellar human for spending the time tracking this down! I wonder if this should be reported to the Virtualmin devs as a bug…?

Dibs · August 12, 2022, 5:39pm

I don’t think it’s a real bug. When php-fpm is installed, the default mode is everything runs under one pool - the www pool. When you switch to individual pools, the www conf file needs to be renamed out of the way.

To my mind - more a “missing” thing in php-fpm’s documentation than anything else.

Still interested in how @dougtracey is getting on.

hescominsoon · August 14, 2022, 4:06am

@Dibs so does virt NOT set up a pool for each site by default? Is it just putting all of them into the central www pool by default?

Dibs · August 14, 2022, 12:33pm

@hescominsoon - I was referring to the default behaviour of php-fpm.

Virtualmin - each time I add a new virtual server, it creates a new pool.