Service monitor : Postfix Server down

jtomelevage · December 21, 2024, 7:02pm

SYSTEM INFORMATION
OS type and version	Debian Linux 11
Virtualmin version	7.30.2

We are getting daily email notifications, at nearly the same time each day, saying that the Postfix server is down:

Monitor on {server_name} for 'Postfix Server' has detected that the service has gone down at 12/21/2024 10:55 AM

When I check, the Postfix server is running, so I presume it had restarted itself.

This started around the last virtualmin/webmin update.

Thanks for any suggestions.

Joe · December 21, 2024, 9:09pm

I would assume it never went down.

But, you can check. Look at systemctl status postfix and check how long it’s been up.

It’s possible it did restart if you have automatic updates enabled (every update for a service will cause it to restart).

jtomelevage · December 21, 2024, 10:26pm

@Joe - Thank you for your reply.

Here is what I get:

systemctl status postfix
● postfix.service - Postfix Mail Transport Agent
     Loaded: loaded (/lib/systemd/system/postfix.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sat 2024-12-21 10:55:09 PST; 3h 28min ago
    Process: 3698507 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 3698507 (code=exited, status=0/SUCCESS)
        CPU: 4ms

Dec 21 10:55:09 {server_name} systemd[1]: Starting Postfix Mail Transport Agent...
Dec 21 10:55:09 {server_name} systemd[1]: Finished Postfix Mail Transport Agent.
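
Note that the "since" time above (10:55:09) matches the 10:55 AM in the monitor notification. A minimal sketch for pulling that timestamp out of the status text, so it can be compared against notification times, might look like the following. The function name and regex are illustrative assumptions, not part of Webmin or systemd. Also, on Debian this postfix.service unit appears to be a wrapper (ExecStart=/bin/true) and the daemon itself typically runs as the postfix@- instance, so that unit may be worth checking too.

```python
import re
from datetime import datetime

# Matches the "Active:" line printed by `systemctl status`, e.g.
#   Active: active (exited) since Sat 2024-12-21 10:55:09 PST; 3h 28min ago
ACTIVE_RE = re.compile(
    r"Active:\s+(?P<state>\S+)\s+\((?P<sub>[^)]+)\)\s+since\s+"
    r"\w{3}\s+(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_active_line(status_text):
    """Return (state, sub_state, start_time), or None if the text has no
    Active line with a "since" timestamp.

    The timezone abbreviation is ignored, so start_time is a naive datetime
    in the server's local time.
    """
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    started = datetime.strptime(match.group("stamp"), "%Y-%m-%d %H:%M:%S")
    return match.group("state"), match.group("sub"), started


# The Active line from the output posted above.
sample = "     Active: active (exited) since Sat 2024-12-21 10:55:09 PST; 3h 28min ago"
print(parse_active_line(sample))
```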

How can I see if automatic updates are enabled?

Joe · December 21, 2024, 10:55pm

That’s how long ago it restarted, so maybe it actually did stop. I don’t think that time is likely to coincide with automatic updates. Do you have out of memory errors in the kernel log? Could be the OOM killer.
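
One way to check for OOM kills is to scan the kernel log (for example the output of journalctl -k or dmesg) for the kernel’s "Out of memory: Killed process" message. The sketch below assumes that message format, which recent kernels use; the helper name and the sample lines are made up for illustration and do not come from the poster’s server.

```python
import re

# Kernel message written when the OOM killer terminates a process, e.g.
#   Out of memory: Killed process 1234 (name) total-vm:...
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines):
    """Return a list of (pid, process_name) for each OOM kill found."""
    kills = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Made-up sample lines, for illustration only.
sample_log = [
    "Dec 21 10:54:58 host kernel: Out of memory: Killed process 4242 (master) total-vm:43000kB",
    "Dec 21 10:55:09 host systemd[1]: Starting Postfix Mail Transport Agent...",
]
print(find_oom_kills(sample_log))
```

An empty list means no OOM kill messages were found in the lines given, which says nothing about log entries that have already been rotated away.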

Automatic updates on Debian are handled by the unattended-upgrades package. There could be other things, but that’s the most likely.
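
To see whether unattended-upgrades is switched on, the usual place to look is the APT periodic configuration, commonly /etc/apt/apt.conf.d/20auto-upgrades. The following is a rough sketch that checks config text for the APT::Periodic::Unattended-Upgrade setting; the function name is hypothetical, the file path is an assumption about a typical Debian install, and numeric values are the only ones handled.

```python
import re

PERIODIC_RE = re.compile(r'APT::Periodic::Unattended-Upgrade\s+"(\d+)"\s*;')


def unattended_upgrades_enabled(apt_conf_text):
    """True if the text sets APT::Periodic::Unattended-Upgrade to non-zero."""
    enabled = False
    for line in apt_conf_text.splitlines():
        stripped = line.strip()
        # Skip commented-out settings.
        if stripped.startswith("//") or stripped.startswith("#"):
            continue
        match = PERIODIC_RE.search(stripped)
        if match:
            enabled = int(match.group(1)) != 0  # a later setting overrides
    return enabled


# Typical contents of the file when automatic upgrades are enabled.
sample_conf = (
    'APT::Periodic::Update-Package-Lists "1";\n'
    'APT::Periodic::Unattended-Upgrade "1";\n'
)
print(unattended_upgrades_enabled(sample_conf))

# On a real system (path is an assumption):
#   from pathlib import Path
#   text = Path("/etc/apt/apt.conf.d/20auto-upgrades").read_text()
#   print(unattended_upgrades_enabled(text))
```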

Also, what’s your system uptime? It’s possible for the service monitors to fire during a reboot at a time when a service would be down as a normal part of the reboot (it either goes down before the monitor or comes up after).

WoozyFace · December 23, 2024, 3:37pm

I started to experience the same here…

jtomelevage · December 23, 2024, 6:28pm

@Joe - I don’t think it is an OOM matter. Here is our dash info:

Real memory	8.59 GiB used / 1.61 GiB cached / 62.79 GiB total

Here is the uptime info:

System uptime	18 days, 2 hours, 29 minutes

This does not seem related to any recent reboot(s).

Joe · December 23, 2024, 8:33pm

Oh, another event that can trigger a restart is TLS certificate changes in Postfix, including automatic Let’s Encrypt certificate renewals that include the one that is being used for Postfix (or any edit in Virtualmin that affects the Postfix configuration, though not virtual map updates).

Joe · December 23, 2024, 8:36pm

And, just to be clear: nothing is actually wrong, right? Postfix is working, correct? You’re just trying to figure out why it stopped briefly?

jtomelevage · December 23, 2024, 11:04pm

@Joe - Yes, at least I cannot see anything that is not working.

This server has been running in its current configuration and on this VM for years. This message just started in the past month, and since we’ve never seen it before I wanted to ask why.

I hear you on the TLS certificate question, and it is possible that we see the error “around” when certificates are updated, because there are 38 VMs on this server, so certificate updates are pretty regular occurrences.
Next time we see this notification, if there is a next time, I will pay closer attention to whether a certificate was also updated.

jtomelevage · December 28, 2024, 6:39pm

@Joe - We had some SSL certificates automatically update and the postfix server stopping notification did not happen.

We have not received the notification since my previous message.

jtomelevage · December 29, 2024, 6:18pm

@Joe - Well, I was wrong in my last post.

Yesterday one of the virtual servers’ certificates renewed at 01:04 PM, and at 01:05 PM we received another Postfix server stopped message. So maybe there is a relation after all?

Joe · December 29, 2024, 6:40pm

Yeah, that’s entirely possible. Postfix definitely restarts when certificates change, and that restart takes a moment (and having a bunch of TLS certs to load slows it down quite a lot). It’s harmless, though. Mail is a resilient protocol: if the server doesn’t respond for a few seconds, mail is just delayed for a few minutes and retried.

I’m not coming up with a perfect solution to make it not notify of a down server, if you’ve configured notifications for it, since it seems like it really is down at the time it’s checked in these events. But, I guess you probably want to make it only alert after two failures instead of just one.

jtomelevage · December 29, 2024, 7:01pm

@Joe - If the server is not stopped / down and is just restarting I think it’s no big deal.

Why do you think we only started to receive these messages in the past month or two?

If I set the “Failures Before Reporting” to 2 instead of 1 does that just mean that every 2nd certificate renewal I will receive a notification?

Joe · December 29, 2024, 7:21pm

Probably more certs making it slower to restart. Or busier mail server. Lots of things can make Postfix take a little longer to restart. You can restart it yourself manually to see how long it takes (though the queue is dynamic, and could be different every time you restart, and rapidly restarting one after the other will be faster as the queue will be mostly empty on the second restart). It’s normally pretty fast, though, so maybe you’re seeing clues of something wrong (like a lot of spam coming in or going out), and that may be a thing worth looking into.
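
A small sketch for timing a manual restart is shown below. The helper is generic and hypothetical; the actual restart call is left as a comment because it needs root and briefly interrupts mail delivery. The measured time is only how long the command takes to return, which may not be exactly how long Postfix is unavailable to the monitor.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    completed = subprocess.run(argv, check=False)
    return completed.returncode, time.monotonic() - start


# Harmless demonstration: time a Python process that does nothing.
code, seconds = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, took {seconds:.2f}s")

# To time a Postfix restart instead (run as root):
#   code, seconds = time_command(["systemctl", "restart", "postfix"])
```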

A peek at the mail log or the journal for the postfix unit is never a bad idea.

No. It’s not catching it every time a cert is renewed, I’m sure. We’re just talking about a race condition here. The monitor happens to run when Postfix is in the middle of restarting sometimes. I can’t imagine that would happen two times in a row (the checks run every five minutes by default, Postfix will certainly be finished restarting in five minutes).

To be clear: If Webmin is running, it will run its status checks on schedule. It doesn’t know anything about why a service is down, it just sees it’s down and reports that. If it comes back up by the time of the next check five minutes later, and it only notifies on two failures, it won’t notify. If it doesn’t come back up in five minutes, or is somehow down again at exactly the time of the next check, it’ll notify. Whenever it is seen back up, the count restarts.
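
The counting behaviour described above can be modelled in a few lines. This is a simplified sketch of the logic as described in this thread, not Webmin’s actual code; the function name is hypothetical, and it assumes one notification is sent when the run of consecutive failures first reaches the threshold.

```python
def notifications(check_results, failures_before_reporting):
    """Return the indices of checks at which a "down" notification is sent.

    check_results holds one boolean per scheduled check (True = service seen
    up). A notification fires when the run of consecutive failures reaches
    the threshold; any successful check resets the count.
    """
    sent = []
    consecutive = 0
    for i, is_up in enumerate(check_results):
        if is_up:
            consecutive = 0
            continue
        consecutive += 1
        if consecutive == failures_before_reporting:
            sent.append(i)
    return sent


# Two separate restarts, each caught by a single check.
blip = [True, False, True, True, False, True]
# A real outage spanning three checks in a row.
outage = [True, False, False, False, True]

print(notifications(blip, 1))    # [1, 4]
print(notifications(blip, 2))    # []
print(notifications(outage, 2))  # [2]
```

With the threshold at 2, isolated one-check failures never accumulate across separate restarts, so the setting does not translate to "every second certificate renewal". Only a service that is still down at the next check triggers a notification.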

shoulders · December 29, 2024, 7:27pm

There was an update to email notification Options where the webmin default email would become an option. Perhaps this is now selected whereas before it was not?

jtomelevage · December 30, 2024, 5:52pm

@Joe - This server has 38 virtual servers running on it, and this number has not changed for about 1 year, so the number of certificates does not seem a likely cause, since this only started a month or so ago.

It seems that spam is a roller coaster in general. To me it looks like we go through spells of heavy spam, and then maybe those spammers get filtered or close shop, and then sometime later we see another rise in spam.

Your race condition explanation seems good to me, with the certificate renewal coinciding with other activity.

I have noticed that our nightly backups have doubled in the time they take starting about a month or two ago. The full backup of 38 virtual servers used to take around 40 minutes and now they take 1 hour and 20 minutes. I thought that our hosting provider may have moved us to a slower or more congested server. If you think this may be a symptom of something else and / or related let me know if there is something I can check.

@shoulders - If a new email notification option was added, which one was it, and was it default enabled? Was this new email notification the one @Joe already mentioned?

shoulders · December 31, 2024, 11:02am

here you go

github.com/webmin/webmin

Configure a single email address to be used for notifications

opened 08:48AM - 12 Oct 24 UTC

closed 04:25AM - 18 Oct 24 UTC

shoulders

SYSTEM INFORMATION
OS type and version	Ubun…tu Linux 22.04.5
Usermin version	2.102
Virtualmin version	7.20.2 Pro
Theme version	21.20.7
Apache version	2.4.52
Package updates	9 package updates are available

## the issue

You can set system notifications to be sent to emails in many places in Webmin (possibly Virtualmin as well), and in each case you have to enter an email address. This means that if you want to change the email, you have to go to many areas and change the email address.

**Some examples**

Webmin --> Tools --> System and Server Status --> Scheduled Monitoring --> Email status report to
![image](https://github.com/user-attachments/assets/1e4ec4d7-38cf-4a26-a76c-584f0aeeeaea)

Webmin --> System --> Software Package Updates --> Scheduled Upgrades
![image](https://github.com/user-attachments/assets/e76e5cae-a4a6-483f-9da4-ab82c46e480f)

## proposed solution

This is a 2 part solution (+ a virtualmin option)

- Add a new section where you can set the notification emails. I have picked a location but I am not fixed to it.
- In virtualmin possibly set this email during the installer
- update the email fields where required. I will just do one email below

### in webmin --> webmin configuration --> Sending Email
![image](https://github.com/user-attachments/assets/6dfd9299-09d0-4683-92eb-e77539e42075)

### In Webmin --> System and Server Status --> Scheduled Monitoring --> Email status report to
![image](https://github.com/user-attachments/assets/c902d119-f72a-458e-a26b-5b8f3cf3e5a2)

- so here you can use the default system notification email or set one of your own.
- The default system notification should be selected by default

## additional

- Maybe also add the **From: address for email** field to the new setting. Again this will unify the notifications.
- Most monitors should be set by default to send an email, and if there is an email present in the notification field then one will be sent; if there is no email then no email will be sent.

Don’t know

system · March 1, 2025, 11:03am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.