MarisDB server is down

jpour · July 6, 2024, 9:46am

The MariaDB server is down some days a week. After restarting it, it works normally and may be down two days later.
MySQL error log
Version: ‘10.5.23-MariaDB-0+deb11u1’ socket: ‘/run/mysqld/mysqld.sock’ port: 3306 Debian 11
2024-07-06 7:35:56 3 [Warning] Access denied for user ‘root’@‘localhost’ (using password: NO)
2024-07-06 7:35:56 4 [Warning] Access denied for user ‘root’@‘localhost’ (using password: NO)
2024-07-06 7:35:57 0 [Note] InnoDB: Buffer pool(s) load completed at 240706 7:35:57

systemctl result:
Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 2024-06-19 10:13:09 UTC; 2 weeks 0 days ago
Docs: man:mariadbd(8)

Main PID: 1104865 (code=exited, status=0/SUCCESS)
Status: “MariaDB server is down”
CPU: 3min 35.300s

SYSTEM INFORMATION	B
OS type and version	debian 11
Webmin version	2.111
Virtualmin version	7.10.0
Webserver version	REQUIRED
Related packages	SUGGESTED

jimr1 · July 6, 2024, 9:53am

It may be being killed by OOM (out of memory), how much ram do you have and how have you configured maria, are you running wordpress with loads of plugins ? Please supply some more details

jpour · July 6, 2024, 10:25am

thanks
RAM =6 GB
# The MariaDB configuration file
#
# The MariaDB/MySQL tools read configuration files in the following order:
# 0. “/etc/mysql/my.cnf” symlinks to this file, reason why all the rest is read.
# 1. “/etc/mysql/mariadb.cnf” (this file) to set global defaults,
# 2. “/etc/mysql/conf.d/*.cnf” to set global options.
# 3. “/etc/mysql/mariadb.conf.d/*.cnf” to set MariaDB-only options.
# 4. “~/.my.cnf” to set user-specific options.
#
# If the same option is defined multiple times, the last one will apply.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# If you are new to MariaDB, check out Basic MariaDB Articles - MariaDB Knowledge Base

#
# This group is read both by the client and the server
# use it for options that affect everything
#
[client-server]
# Port or socket location where to connect
# port = 3306
socket = /run/mysqld/mysqld.sock

# Import all .cnf files from configuration directory
!includedir /etc/mysql/conf.d/
!includedir /etc/mysql/mariadb.conf.d/

[mysqld_safe]
log_error=/var/log/mysql/mysql_error.log

[mysqld]
log_error=/var/log/mysql/mysql_error.log

one of my virtual servers is WordPress with a few plugins.

jimr1 · July 6, 2024, 10:57am

Try

grep -i 'killed process' /var/log/messages

See if that shows any results

jpour · July 6, 2024, 12:24pm

No result was found.

jpour · July 6, 2024, 3:46pm

I found the killed process

sudo journalctl --utc -b -ke
Out of memory: Killed process 3730122 (mariadbd) total-vm:3455056kB, anon-rss:954536kB, file-rss:0kB, shmem>

jimr1 · July 6, 2024, 3:56pm

Looks like a wordpress issue, some will disagree, but as a work around you could add a system monitor for mysql and get the service to restart when it goes down. Research webmin system monitors for more information and make sure you add the correct command to restart the service when it goes down, don’t ask me why wordpress does this to mariadb, I have a cloned site running with mysql rather than mariadb and it does not get OOM failures with similar user interactions. This is a work around not a fix

jpour · July 6, 2024, 4:11pm

Your feedback proved to be quite valuable.

Joe · July 6, 2024, 11:42pm

Beware that this will eventually result in data loss and/or database corruption (most likely the former). You can’t kill a database over and over without something getting broken. Starting it isn’t dangerous, but papering over that it’s being killed regularly by having it restart is just hiding a very dangerous thing. This is a case where you must find and solve the root issue, you can’t merely mitigate it with an automatic restart.

The only actual solution is to solve the memory problem.

Figure out what’s using a lot of memory, and reduce its usage via configuration or stop using it. We can’t guess what’s using all the memory based on the information we have…nothing points at a WordPress issue, thus far, but it could be.

@jpour you need to look at memory usage (check top or htop and sort by memory to see current highest memory users), and figure out what’s the most problematic.

If you don’t have swap, you should create and enable it, as a stop gap. That’s a much safer choice than letting Mariadb keep getting killed and then restarting it automatically. Swap memory is (extremely) slow, and it will hurt performance of your system to rely on it, but slow is better than data loss. So, add a few GB of swap space until you’re able to identify and resolve the memory issue(s).

verne · July 7, 2024, 4:08am

a year or two ago the main reason we adjusted all our virtual servers to use the php-fpm method was that gives good control on the number of child processes that are created - as we too often experienced OOM issues that took down MariaDB.

That is, under the older php framework (I forget the technical term), way too many child processes were being created, and that triggered OOM.

Joe · July 7, 2024, 4:45am

Yes, PHP-FPM is more efficient than other execution modes (the other options are fcgid+suexec, CGI+suexec, and mod_php, in order of best to worst, PHP-FPM is the recommended mode, and generally the default mode).

But, we’re guessing until OP looks to see what’s actually using all their memory.

shoulders · July 7, 2024, 8:16am

@Joe

Which is better for swap, A swap file or a swap partition?
Would you recommend 2 x system RAM for swap size?

Thanks

For reference:

I have Ubuntu 22.04 with 8GB RAM.
How to Add Swap Space on Ubuntu 20.04 - GeeksforGeeks

Joe · July 7, 2024, 8:43am

Doesn’t matter much. Partition is better, but it makes so little difference, I wouldn’t go out of my way to make it a partition if you didn’t create it during OS installation.

That depends on how much actual memory you have. You need enough memory (swap+real) to never run out. Preferably you rarely need to leave real memory, as it is catastrophic for performance, but not as catastrophic as running out of memory and having processes killed at random.

jimr1 · July 8, 2024, 3:36pm

perhaps then the webmin module System & Server Status should not offer the ability to restart a service if it goes down, or in fact not try to do anything if the service is unresponsive, It’s a webmin module authored by yourselves so surely it must respect that you can not restart if the service goes down/unresponsive ?

Stegan · July 8, 2024, 4:02pm

but the problem seems to be more about knowing where to look and finding out why. (then having an idea of how to go about fixing it) OOM issues aside the other potential problems could be related to other system issues like we have seen in these pages. (a plugin, out-of-date php/mysql, backup/restore issues, db design, indexing, etc)

jimr1 · July 8, 2024, 4:14pm

however if you setup the system monitor to restart the service,

this implies a webmin module is not doing what it should do

calport · July 8, 2024, 4:17pm

Virtualmin offers the ability to restart a service along with a notification to the administrator that the service has been restarted. I like this because when both these are used in combination, an administrator can offer high availability of services and at the same time is informed of the automated action that has been taken by Virtualmin, so that the cause can be investigated and addressed.

Absent this a service would go down and remain down till it has been manually restarted. I don’t think this would be acceptable to users who expect high uptime.

If the service keeps crashing every minute, it should not keep being automatically restarted and Joe has explained why.

jimr1 · July 8, 2024, 4:19pm

So do I but @Joe says this is a very dangerous thing to do

calport · July 8, 2024, 4:20pm

Joe said that it should not be done repeatedly. If the database server crashes every minute, then it should not be automatically restarted - that’s not the solution to the problem.

Joe never said that it should not be done at all. The fact that a service has stopped unexpected is bad - this is the point at which data loss occurs, not when it is restarted. Keep restarting it and if it keeps crashing, the chances of data loss become high.

jimr1 · July 8, 2024, 4:26pm

System Monitors will continue the service and take the action you have implemented on it’s time schedule so therefore it will be repeatedly restarted , if the service does not restart correctly at whatever schedule you have set, means the system monitor module will try again and therefore run into @Joe’s comment