Virtualmin crashing and all websites offline until server reboot

Hello, I am having issues on a relatively new server: the CPU spikes, the whole server crashes, Virtualmin stops serving any websites (page not found etc.), and the Virtualmin GUI becomes inaccessible at the same time. When this happens, I can SSH into the server and reboot it. This resolves the issue for another couple of weeks at most, and then my uptime monitoring software alerts me that all of the websites are offline again and I have to SSH in and reboot it once more.

The server is a VM droplet hosted on DigitalOcean. The DigitalOcean statistics show that average CPU usage over 24 hours usually does not peak above 20%. Memory usage never rises above 50%, and bandwidth to and from the server is minimal, so I know there is no DDoS attack or similar. When the issue occurs, the DigitalOcean stats show CPU running at 80-100%, memory stays the same (approx. 50%), but disk I/O jumps to 150MB/s, whereas it usually sits below 50MB/s.

The VM is a Standard with 2 vCPUs, 4GB RAM, 80GB SSD and is running Debian 10.3.

The server is hosting approx. 12 WordPress websites. They are all small business sites, with no large files and no high volume of traffic (in the past I’ve run many more sites on a VM with fewer resources without any issues).

When it does crash, I have little time to troubleshoot and gather logs as I just need to get the websites back online as quickly as possible (and rebooting the whole server is the quickest way).
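Since there is little time to investigate during an outage, one option is to capture a quick snapshot of memory and load just before rebooting. The following is a minimal Python sketch, not anything shipped with Virtualmin. It assumes a Linux /proc filesystem with a MemAvailable field (present on the 4.19 kernel listed below), and the function names parse_meminfo and used_percent are made up for illustration.

```python
import time


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into {field: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])  # most values are in kB
    return info


def used_percent(info):
    """Memory in use, counting reclaimable cache as available."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        with open("/proc/loadavg") as fh:
            load = fh.read().split()[:3]  # 1, 5 and 15 minute averages
        print(time.strftime("%Y-%m-%d %H:%M:%S"),
              f"mem_used={used_percent(info):.1f}%",
              "load=" + "/".join(load))
    except (OSError, KeyError) as exc:
        # Not on Linux, or the expected fields are missing.
        print(f"snapshot unavailable: {exc}")
```

Run on a schedule with the output appended to a file, something like this would leave a record of the minutes before a crash, which the hosting panel graphs may be too coarse to show.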

Below are the server details from the Virtualmin dashboard:

System hostname: web-1.
Operating system: Debian Linux 10
Webmin version: 1.942
Usermin version: 1.791
Virtualmin version: 6.09
Authentic theme version: 19.46
Time on system: Friday, July 10, 2020 11:07 AM
Kernel and CPU: Linux 4.19.0-9-cloud-amd64 on x86_64
Processor information: Intel® Xeon® CPU E5-2630L v2 @ 2.40GHz, 2 cores
System uptime: 1 hours, 55 minutes
Running processes: 162
CPU load averages: 0.40 (1 min) 0.39 (5 mins) 0.27 (15 mins)
Real memory: 1.3 GiB used / 1.71 GiB cached / 3.85 GiB total
Local disk space: 10.8 GiB used / 67.9 GiB free / 78.71 GiB total

I previously had a Virtualmin server hosted on Linode on Ubuntu 14 that hosted approx. 20 websites and ran flawlessly for 3 or 4 years. However, this new server on DigitalOcean has been doing this since day one (approx. 3 months ago). We have many other servers used for other purposes (PBX systems, file storage etc.) on DigitalOcean. They are mostly Debian 10 as well, and they run fine.

We take a backup every night using the Virtualmin GUI to send copies of the websites to an FTP server.

I suspect there’s something broken in the Virtualmin setup somewhere but I am unsure as to where to look next to try and pin it down…?

Hi,

This doesn’t look like a Virtualmin issue at all!

Most commonly, you would want to start by checking the Apache and MariaDB logs, along with the system logs (check for the OOM killer), to get a clearer picture of what is really going on.
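As a rough illustration of that log check, here is a small Python sketch that scans log files for lines that usually indicate memory pressure. The patterns are a starting point rather than an exhaustive list (kernel wording varies between versions), the default paths are assumptions for a stock Debian 10 install, and find_memory_events is a made-up name.

```python
import re
import sys

# Phrases the kernel and Apache commonly log under memory pressure.
MEMORY_PATTERNS = [
    re.compile(r"out of memory", re.IGNORECASE),
    re.compile(r"oom-killer", re.IGNORECASE),
    re.compile(r"cannot allocate memory", re.IGNORECASE),
    re.compile(r"memory allocation failed", re.IGNORECASE),
]


def find_memory_events(lines):
    """Return the log lines that match any memory-pressure pattern."""
    return [line.rstrip("\n") for line in lines
            if any(p.search(line) for p in MEMORY_PATTERNS)]


if __name__ == "__main__":
    # Assumed default locations; pass other paths as arguments.
    paths = sys.argv[1:] or ["/var/log/syslog",
                             "/var/log/kern.log",
                             "/var/log/apache2/error.log"]
    for path in paths:
        try:
            with open(path, errors="replace") as fh:
                for hit in find_memory_events(fh):
                    print(f"{path}: {hit}")
        except OSError as exc:
            # Missing file or insufficient permissions.
            print(f"{path}: skipped ({exc})")
```

Reading the system logs normally requires root. If nothing from the kernel shows up but Apache still reports allocation failures, that would point towards a per-process or per-service limit rather than the machine as a whole running out of RAM.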

Additionally, you may want to try doubling the RAM to see whether that resolves the issue. If it does, you can remove the added RAM afterwards and tune the running services accordingly.

Hi,

Thanks for the reply. The server crashed again after only 2 days of uptime. At around 02:45am it just stopped serving all web requests and I got an alert from my external HTTP monitoring service. I then logged in via SSH and looked at the log files.

I was able to restart Virtualmin with the command /etc/init.d/webmin start

From there I was able to see that services were showing as running ok, with the exception of Apache – this was showing as stopped.

I’ve exported a few of the Apache logs and there are a few lines such as:

mmap() failed: [12] Cannot allocate memory
[Mon Jul 13 02:41:36.994572 2020] [core:notice] [pid 723:tid 140016382289024] AH00052: child pid 30748 exit signal Aborted (6)

[Mon Jul 13 02:41:41.066054 2020] [mpm_event:alert] [pid 17526:tid 140016382289024] (11)Resource temporarily unavailable: AH00480: apr_thread_create: unable to create worker thread

[crit] Memory allocation failed, aborting process.

[Mon Jul 13 02:41:42.043960 2020] [core:notice] [pid 723:tid 140016382289024] AH00052: child pid 30745 exit signal Aborted (6)

[Mon Jul 13 02:41:42.044367 2020] [fcgid:error] [pid 723:tid 140016382289024] mod_fcgid: fcgid process manager died, restarting the server

[Mon Jul 13 02:41:45.392299 2020] [core:warn] [pid 723:tid 140016382289024] AH00045: child process 17486 still did not exit, sending a SIGTERM

[Mon Jul 13 02:41:47.394550 2020] [core:warn] [pid 723:tid 140016382289024] AH00045: child process 17486 still did not exit, sending a SIGTERM

[Mon Jul 13 02:41:49.396749 2020] [core:warn] [pid 723:tid 140016382289024] AH00045: child process 17486 still did not exit, sending a SIGTERM

[Mon Jul 13 02:41:51.398933 2020] [core:error] [pid 723:tid 140016382289024] AH00046: child process 17486 still did not exit, sending a SIGKILL

[Mon Jul 13 02:41:52.400135 2020] [mpm_event:notice] [pid 723:tid 140016382289024] AH00494: SIGHUP received. Attempting to restart

AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message

[Mon Jul 13 02:41:52.517441 2020] [mpm_event:notice] [pid 723:tid 140016382289024] AH00489: Apache/2.4.38 (Debian) mod_fcgid/2.3.9 OpenSSL/1.1.1g configured -- resuming normal operations

[Mon Jul 13 02:41:52.517475 2020] [core:notice] [pid 723:tid 140016382289024] AH00094: Command line: '/usr/sbin/apache2'
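To see at a glance which errors dominate a longer log, the timestamped lines above can be tallied by module, level and AH code. This is only a sketch: the regular expression is written against the line shape shown in this excerpt (Apache 2.4 with the pid:tid field) and may need adjusting for a custom ErrorLogFormat, and summarize is a made-up name.

```python
import re
from collections import Counter

# [timestamp] [module:level] [pid N:tid M] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<module>[^:\]]+):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)(?::tid \d+)?\] (?P<msg>.*)$"
)
# Apache message identifiers such as AH00480.
CODE_RE = re.compile(r"\bAH\d{5}\b")


def summarize(lines):
    """Count log lines per (module, level, AH code); '-' means no code."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # lines without the standard prefix are ignored
        code = CODE_RE.search(m.group("msg"))
        key = (m.group("module"), m.group("level"),
               code.group(0) if code else "-")
        counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys
    for key, n in summarize(sys.stdin).most_common():
        print(n, *key)
```

It reads the log from standard input. Lines without the bracketed prefix, such as the mmap() failure, are skipped by this sketch and would need a separate check.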

Clicking the start button for Apache in the services list brought all of the websites back online.

I understand your comment regarding the RAM, however I’m slightly confused, since I was able to serve this same set of websites on the old system on Linode running Virtualmin on Ubuntu 14 with only 2GB RAM. This system has 4GB. Plus, my DigitalOcean control panel shows that RAM usage never peaks above 50%, even just before Apache crashes.

Are there limits to how much memory individual services such as Apache and MySQL can take up? Is it possible Apache is hitting some sort of false limit and crashing from there?

Many thanks
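On the question of per-service limits: Linux does apply per-process resource limits, and the kernel exposes the values in effect for a running process in /proc/<pid>/limits. The sketch below parses that file so the limits of the Apache parent process can be inspected. It is illustrative only: parse_limits is a made-up name, and whether any of these limits is what Apache actually hit here is not established by the logs in this thread. Other ceilings, such as a task limit imposed by the service manager, are not shown in this file.

```python
import re
import sys


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits


if __name__ == "__main__":
    # Pass the PID of the Apache parent process; defaults to this script.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        with open(f"/proc/{pid}/limits") as fh:
            limits = parse_limits(fh.read())
    except OSError as exc:
        print(f"could not read limits for {pid}: {exc}")
    else:
        for name in ("Max processes", "Max address space", "Max open files"):
            soft, hard = limits.get(name, ("?", "?"))
            print(f"{name}: soft={soft} hard={hard}")
```

"Max processes" is worth a look alongside the apr_thread_create error, because on Linux that limit counts threads as well as processes, and "Max address space" bounds how much a single process can map regardless of free RAM.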
