I am trying to troubleshoot an intermittent server crash that is happening on my Virtualmin server every few days. I notice that RAM usage gets higher and higher until it tops 80%, and some time after that the crash happens.
My main suspect in this is WordPress, which I installed via the Virtualmin manage applications feature, i.e. the standard Virtualmin way. There seem to be 4 or 5 running processes for the same WP site? At 200-odd MB each, this adds up pretty quickly.
Why are so many processes needed to run a single WP site?
Also, any tips on how to investigate and resolve this kind of problem would be much appreciated.
Because more than one connection is happening at any given time. The internet does not wait its turn in a line, so one process alone can't serve the website for all those users. (Even a single user can generate many simultaneous requests.)
You're overestimating how much memory is being consumed, though. All php-fpm processes for the same version of PHP will share quite a bit of memory, because they'll all be loading a bunch of the same shared libraries. The same is true of a lot of other software that fires up multiple processes: if you're simply adding up the per-process figures, you're overestimating memory used by a lot.
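One way to see how much the naive sum overstates things is to compare Rss (which counts shared pages once per process) with Pss (which splits each shared page among the processes mapping it). The following is a minimal sketch, assuming a Linux kernel new enough to provide /proc/<pid>/smaps_rollup and enough privileges (typically root) to read it for other users' processes. The function names and the "php-fpm" name fragment are illustrative, not anything Virtualmin provides.

```python
import glob
import os
import re


def parse_smaps_rollup(text):
    """Return {'Rss': kB, 'Pss': kB} parsed from the text of /proc/<pid>/smaps_rollup."""
    totals = {}
    for line in text.splitlines():
        # Match only the exact "Rss:" and "Pss:" lines, not Pss_Anon, Pss_File, etc.
        match = re.match(r"^(Rss|Pss):\s+(\d+)\s+kB", line)
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


def php_fpm_memory(name_fragment="php-fpm"):
    """Sum Rss and Pss (in kB) over processes whose command name contains name_fragment."""
    rss_total = 0
    pss_total = 0
    for comm_path in glob.glob("/proc/[0-9]*/comm"):
        rollup_path = os.path.join(os.path.dirname(comm_path), "smaps_rollup")
        try:
            with open(comm_path) as f:
                if name_fragment not in f.read():
                    continue
            with open(rollup_path) as f:
                totals = parse_smaps_rollup(f.read())
        except OSError:
            continue  # process exited, file missing, or permission denied
        rss_total += totals.get("Rss", 0)
        pss_total += totals.get("Pss", 0)
    return rss_total, pss_total
```

Calling php_fpm_memory() as root returns the two totals in kB. If the Pss total is much smaller than the Rss total, most of the apparent usage is shared pages being counted several times.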
It's possible you are running out of memory, but 80% usage does not indicate that to be the case. You should check the kernel log (dmesg or the journal) for Out Of Memory errors. If you see those, you are definitely running out of memory.
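As a rough illustration of that check, the sketch below filters kernel log lines for the two messages the OOM killer usually leaves behind ("invoked oom-killer" and "Out of memory: Killed process ..."). The exact wording varies between kernel versions, so treat the pattern as an assumption to adjust. The function name is made up for this example.

```python
import re

# Older kernels log "Kill process", newer ones "Killed process".
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)"
)


def find_oom_events(lines):
    """Return one dict per kernel log line that mentions the OOM killer."""
    events = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            # pid and name are only present on the "Killed process" lines.
            pid = int(match.group(1)) if match.group(1) else None
            events.append({"pid": pid, "name": match.group(2), "line": line.strip()})
    return events
```

To use it, save the output of dmesg or journalctl -k to a file and pass the open file object to find_oom_events. An empty result suggests the OOM killer has not fired in the period the log covers.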
Also, what does "server crash" mean here? Is the server itself literally crashing? If so, that has nothing to do with memory, or any software you have control over. A server crash indicates hardware or kernel bugs. Nothing in user space can crash the server without something in hardware or the kernel failing. But if you just mean "the website isn't responding", that's a different issue, with different troubleshooting steps, and we need to differentiate between them.
Thank you, Joe, that is very helpful. Yes, I was just adding up the memory numbers. Plus, the 80% thing is good to know.
The server crash is a full crash: the server is completely unresponsive, I can't reach it via Virtualmin, and I can't even SSH into it. I go into the AWS console and restart it, which brings it back up.
Do you have any advice on where to start? I haven't been able to link the crash with any particular event on the server. It is not taking production load at this point, so there isn't anything obvious like massive backups, high web traffic, etc. And I guess even if these were present, they wouldn't necessarily lead to a full-on crash.
Then it's either a firewall gone rogue (check fail2ban logs to see if you're locking yourself out somehow), or the system is literally crashing, and there's little you can do about it... your AWS VM is broken. You should make sure your OS is up to date, and if that doesn't fix it, and if you can't find any clues in the kernel log about why it's crashing, you'll need to take it up the chain to someone who can help (we can't; Amazon is selling you an unreliable system, it seems).
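For the fail2ban check, a small sketch like the one below can list every ban or unban recorded for your own address. It assumes the usual fail2ban.actions log format, where lines end in something like "[sshd] Ban 203.0.113.5", and the default log location of /var/log/fail2ban.log. Both may differ on your system, and the function name is hypothetical.

```python
import re

# Matches the "[jail] Ban <ip>" or "[jail] Unban <ip>" part of a log line.
BAN_PATTERN = re.compile(r"\[([^\]]+)\]\s+(Ban|Unban)\s+(\S+)")


def bans_for_ip(lines, ip):
    """Return (jail, action) pairs from fail2ban log lines that mention the given IP."""
    hits = []
    for line in lines:
        match = BAN_PATTERN.search(line)
        if match and match.group(3) == ip:
            hits.append((match.group(1), match.group(2)))
    return hits
```

Pass it the open log file and the public IP you connect from. A "Ban" entry timed just before the server became unreachable would point at a lockout, not a crash.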
As I said, nothing in user space should be able to crash the kernel. If it can, it's a kernel bug or a hardware bug. Even an out of memory event (which is catastrophic for user processes that must be killed to keep the system running) shouldn't be able to crash the system... the kernel will kill whatever it needs to in order to keep running. It's possible it would kill every means of getting in, but it seems unlikely unless something is really running away with memory and also working really hard so the kernel thinks it's the most important thing. So, maybe the kernel killed Webmin and the ssh server. But if you can't get in via the AWS serial console, it really is dead. If you haven't tried getting in via the AWS serial console (or whatever they call it; I assume there is some way to see the actual console, basically a virtual KVM for a virtual machine), you should try that before deciding it's definitely crashed. It may just be that ssh and Webmin have been slain by the OOM killer.
You might find that taking a snapshot, shutting it down, and bringing up a new VM from the snapshot, maybe in another region or with different specs, might get you better reliability. If AWS is having hardware troubles and you're unlucky enough to be stuck with a failing system they haven't caught yet, that might be a solution.
Those are quite large pool sizes. Are you running dynamic?
I switched to ondemand and found a big drop in memory usage.
I've posted this many times, but it seems people don't take much notice.
Here's what I settled on in the PHP template.
If the site is idle, the pool's processes disappear until there is new activity.
Is that screenshot complete and accurate, or just an example?
I ask because my WordPress sites using php-fpm will routinely have 3 to 5 php-fpm (child) processes running all the time, with no issues.
In general, we have seen other php-fpm issues where the config allows 20, 50, or 100 child processes, which many systems cannot handle. One solution to THAT is to limit the number of child processes to a level the system CAN handle.
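A back-of-the-envelope way to pick that limit is to divide the memory you can spare for PHP by the memory one child actually uses. The sketch below only does that arithmetic. The result is what you would consider for php-fpm's pm.max_children setting. The 2048 MB and 1024 MB figures in the example are made-up assumptions, and 200 MB is the per-process figure mentioned earlier in the thread, which probably overstates real usage because it includes shared memory.

```python
def max_children(total_ram_mb, reserved_mb, per_child_mb):
    """Largest number of php-fpm children that fit in RAM after reserving memory for everything else."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_ram_mb - reserved_mb
    # Always allow at least one child so the site can still be served.
    return max(1, available // per_child_mb)


# Hypothetical server: 2048 MB total, 1024 MB kept for the OS, database, mail, etc.
limit = max_children(2048, 1024, 200)
```

With those numbers the limit comes out at 5 children, which is in the same range as the 3 to 5 processes described above.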
OK, thanks. AWS does have what they call a serial connect, but it's only available on Nitro instances, which mine isn't. They also have what they call an "instance connect", and I am not sure exactly what that is.
In any case, when it happens again I'll try to connect via the AWS console as my first port of call.
I mean, a virtual machine would not have a literal serial console, but it would be a virtual equivalent. I just don't know what it's called for AWS. But I believe they all have a way to log in to a console that doesn't use ssh and connects directly to the "screen" that the kernel is producing, which should be immune to the OOM killer, since it is literally part of the kernel.
That's always true of all services at AWS. You must explicitly open all the ports you need (except ssh... maybe they allow 80/443 by default, too). Nothing to do with Virtualmin.
You either set a root password (sudo su - root and then run passwd) or log in as your sudo-capable user instead. Same as any other system that doesn't have a root password after OS installation.
But this is off-topic. Please open a new topic for any further discussion about running Virtualmin on AWS. See FAQ - Virtualmin Community.