Usermin randomly stops working across all Ubuntu 20.04 servers

@Joe,

My primary point was that when you try to cram everything onto a single “small” VPS, you are not giving everything enough resources to run “properly”. I see people placing full stacks on 1GB or 2GB systems, then loading the same system with a bunch of busy (or unoptimized) WordPress sites and starting to see issues with overall performance. A quick look at “htop” shows they’re spiking their resources regularly, and they wonder why services start to shut down… :slight_smile:

*** I’ve also noticed lately that MySQL tends to chew through a ton of CPU/memory when it is not set up correctly or when you start loading lots of database-heavy scripts. ***

Happened again on another server; this one has 16 GB RAM. Found good instructions on how to monitor it using monit:

apt install monit
vi /etc/monit/conf-available/usermin

Add these contents to the new usermin file you are editing:

check host usermin with address 127.0.0.1
start program = "/bin/systemctl start usermin"
stop program = "/bin/systemctl stop usermin"
if failed port 20000 then restart
if 5 restarts within 5 cycles then timeout

Link the file:

ln -s /etc/monit/conf-available/usermin /etc/monit/conf-enabled/

Check the syntax and reload monit:

monit -t
systemctl reload monit

Typical log event:

[SAST Dec 30 08:13:53] error : 'usermin' failed protocol test [DEFAULT] at [127.0.0.1]:20000 [TCP/IP] -- Connection refused
[SAST Dec 30 08:13:53] info : 'usermin' trying to restart
[SAST Dec 30 08:13:53] info : 'usermin' stop: '/bin/systemctl stop usermin'
[SAST Dec 30 08:13:54] info : 'usermin' start: '/bin/systemctl start usermin'
[SAST Dec 30 08:15:56] info : 'usermin' connection succeeded to [127.0.0.1]:20000 [TCP/IP]
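
To see how often monit has had to step in, the restart lines in its log can be counted. This is only a minimal sketch: it assumes the log lines look like the sample above and that monit writes to /var/log/monit.log (check the "set log" line in /etc/monit/monitrc, as that path is an assumption here). The function name count_restarts is made up for this example.

```shell
#!/bin/sh
# count_restarts: hypothetical helper, not part of monit.
# Reads monit log lines on stdin and prints how many times monit
# tried to restart the named service.
count_restarts() {
    service="$1"
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -c "'$service' trying to restart" || true
}

# Usage (log path is an assumption, see above):
#   count_restarts usermin < /var/log/monit.log
```

A count that keeps growing means the restarts are only hiding whatever is stopping Usermin.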

I had the same problem, and it did turn out to be an out-of-memory issue. Until I could deal with that, I used Webmin’s System and Server Status to check and restart the usermin service.

Create a new monitor.

Commands to run > If monitor goes down, run command:

systemctl restart usermin

Monitored service options > Command to run:

nc -z -v localhost 20000

Monitored service options > Exit status check:

Fail monitor if command fails


Yes, I have the same problem.

Sorry, but I haven’t had time to check…
For now I work around it with System and Server Status, as keenmouse suggests.

I have this problem on Debian 10 / AlmaLinux 8.

thank you

Y’all need to look at the Usermin miniserv.error log and the kernel log (for OOM killer messages) to find out why it’s exiting. Usermin does not crash. So, something is killing it.

Restarting it is just masking whatever problem your system has.
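
For anyone unsure where the miniserv.error log lives, a minimal sketch follows. It assumes miniserv.conf is a plain key=value file with an errorlog= line (commonly /var/usermin/miniserv.error, but please verify on your own system). The function name usermin_errorlog is made up for this example.

```shell
#!/bin/sh
# usermin_errorlog: hypothetical helper, not shipped with Usermin.
# Prints the error log path configured in a miniserv.conf file.
usermin_errorlog() {
    conf="${1:-/etc/usermin/miniserv.conf}"
    # Take the value of the first errorlog= line, if there is one.
    sed -n 's/^errorlog=//p' "$conf" | head -n 1
}

# Usage (as root): show the last 50 lines of the configured error log.
#   tail -n 50 "$(usermin_errorlog)"
```

Reading the path from the config avoids guessing where a particular distribution or install method put the log.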

@Joe I can now semi-reliably make it stop working.

You wrote: “kernel log (for OOM killer messages)”

Any clues? I tried this on Ubuntu:

cat /var/log/syslog | grep oom

I also looked carefully through syslog, but I don’t see anything to do with memory or Usermin.

So I need some help with how to detect out-of-memory events on an Ubuntu Linux system, to see if this is causing Usermin to stop working.

Just to be clear about how I make it “crash”:

  • I wait a few days.
  • I try to access Usermin.
  • It stops immediately on first access.
  • I start it again. It works fine, for a while.

Well I found this:

grep oom /var/log/*
grep total_vm /var/log/*

Not seeing anything though. Are we 100% sure a Usermin OOM kill would be logged, and if so, to which file?
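
On the "which file" question: the OOM killer is part of the kernel, so its messages go to the kernel log, not to anything Usermin writes. Below is a minimal sketch of a broader, case-insensitive search, assuming a systemd-based system where journalctl and dmesg are available. The function name oom_lines is made up for this example, and the pattern is a guess at typical kernel wording, not an exhaustive list.

```shell
#!/bin/sh
# oom_lines: hypothetical helper.
# Filters kernel log text on stdin down to lines that look like
# OOM-killer activity. Matching is case-insensitive, so
# "Out of memory: Killed process ..." is caught as well as "oom-killer".
oom_lines() {
    grep -iE 'out of memory|oom-kill|oom_reaper|killed process' || true
}

# Usage (as root): kernel messages from the journal, then the ring buffer.
#   journalctl -k --no-pager | oom_lines
#   dmesg -T | oom_lines
```

Note that grep on /var/log/* does not search rotated .gz files, and the ring buffer only goes back so far. If all of these stay empty right after a reproduced stop, that may suggest something other than the OOM killer is involved.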

I can reliably reproduce this issue now and would love to get it fixed. Clients rely on webmail at critical times, and even though monit is helping, there is still a delay of up to a minute during which the client starts losing trust.

I’m not sure it is related, but I’m using the CSF firewall on my Virtualmin servers.

One week ago I tried changing this:

nano /etc/csf/csf.pignore
cmd:/usr/bin/perl /usr/libexec/usermin/miniserv.pl /etc/usermin/miniserv.conf

Then I restarted csf and usermin.

For now Usermin works fine (on both Debian and AlmaLinux),
but maybe it was an update that solved the issue…

thank you
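
If you want to try the same csf.pignore change on several servers, here is a minimal sketch that adds the line only when it is missing. The function name add_pignore is made up for this example. The command line in the entry has to match what ps shows for miniserv.pl on your own system, and the path in the post above may differ between distributions.

```shell
#!/bin/sh
# add_pignore: hypothetical helper, not part of csf.
# Appends an entry to a csf.pignore-style file only if that exact
# line is not already present, so it is safe to run more than once.
add_pignore() {
    file="$1"
    entry="$2"
    # -x matches whole lines, -F treats the entry as a fixed string.
    grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}

# Usage (as root), with the entry from the post above:
#   add_pignore /etc/csf/csf.pignore \
#     'cmd:/usr/bin/perl /usr/libexec/usermin/miniserv.pl /etc/usermin/miniserv.conf'
# Afterwards restart csf and usermin, as described in the post.
```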

@vander.host what are the RAM and CPU on that server (no swap please - just the real RAM value)?

@unborn RAM + CPU

16 GB RAM
7 vCPUs
Note: 300+ domains with lots of mailboxes, probably > 1000