Dovecot Failed state

That script is whatever your OS provides (a systemd unit file, generally), and it is likely different across distributions. I’m reasonably confident it is not that.

But, I want to be clear that we don’t have a custom dovecot service/unit file here. We aren’t taking over the Dovecot installation and replacing pieces. We’re just modifying the config files and using the OS-provided systemctl reload|restart dovecot or whatever. I suspect the problem is our misuse of ssl_ca directives, but we won’t know until someone actually tests that theory (I don’t have this problem on my servers for unknown reasons, so I can’t test) or until the new version goes out and it’s changed for everyone. (But, y’all could test this theory by making the change and trying both restart and reload to see what happens.)
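For anyone who wants to try the test suggested above, here is a minimal sketch. It assumes the ssl_ca lines live in a single file such as /etc/dovecot/dovecot.conf (the path is an assumption; on some systems they may sit in a file under conf.d instead). The helper name comment_out_ssl_ca is made up for this example.

```shell
#!/bin/sh
# Sketch: comment out every active ssl_ca line in one Dovecot config file.
# The config path used in the usage notes below is an assumption.

comment_out_ssl_ca() {
    # $1 = config file; the original is kept next to it as <file>.bak
    sed -i.bak -E 's/^([[:space:]]*)(ssl_ca[[:space:]]*=)/\1#\2/' "$1"
}

# Usage (as root), then compare how restart and reload behave:
#   comment_out_ssl_ca /etc/dovecot/dovecot.conf
#   doveconf -n >/dev/null && systemctl restart dovecot
#   systemctl reload dovecot
#   systemctl status dovecot
```

Lines that are already commented out are left alone, and the .bak copy makes it easy to put the original file back afterwards.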

I have a Debian 9 server with Dovecot in a failed state, but email is still working:

root@green ~ # service dovecot status
● dovecot.service - Dovecot IMAP/POP3 email server
   Loaded: loaded (/lib/systemd/system/dovecot.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2020-06-18 22:13:24 CEST; 1 weeks 1 days ago
     Docs: man:dovecot(1)
           http://wiki2.dovecot.org/
  Process: 17179 ExecStop=/usr/bin/doveadm stop (code=exited, status=75)
  Process: 1099 ExecStart=/usr/sbin/dovecot (code=exited, status=0/SUCCESS)
 Main PID: 1146 (code=exited, status=0/SUCCESS)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
root@green ~ # systemctl reload dovecot
dovecot.service is not active, cannot reload.
root@green ~ # systemctl restart dovecot
Job for dovecot.service failed because the control process exited with error code.
See "systemctl status dovecot.service" and "journalctl -xe" for details.

reload and restart both fail. Is there anything I can do to help debug the problem? I am currently just ignoring it because the email keeps working.
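A rough sketch of read-only checks that might help narrow this down. Nothing here changes state, and tools that are not installed are skipped. The function names are made up for this example; the commands themselves (systemctl, journalctl, doveconf, pgrep, ss) are standard, but which of them turns up anything useful for this bug is a guess.

```shell
#!/bin/sh
# Sketch: gather read-only diagnostics for a dovecot unit stuck in "failed".

run_if_present() {
    # $1 = label, remaining args = command to run
    label=$1; shift
    echo "== $label =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || true
    else
        echo "($1 not installed, skipped)"
    fi
}

collect_dovecot_debug() {
    run_if_present "unit state"        systemctl --no-pager status dovecot.service
    run_if_present "recent unit log"   journalctl --no-pager -u dovecot.service -n 50
    run_if_present "config check"      doveconf -n
    run_if_present "running processes" pgrep -af dovecot
    run_if_present "listening sockets" ss -ltnp
}

# Usage (as root):
#   collect_dovecot_debug > /tmp/dovecot-debug.txt
```

If the process list shows Dovecot running while the unit is reported as failed, that would at least be consistent with email still working, and it would be worth including in a report.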

Howdy,
@Joe from Virtualmin.
I understand, Joe; it is a problem that the devs cannot reproduce this issue.
So here are the commands I’ve been running whenever my certs are updated:

for i in $(ps aux | grep '[d]ovecot' | awk '{print $2}'); do kill -9 "$i"; done

systemctl restart dovecot.service

This gets Dovecot running again every time. I don’t understand why it works, but it does.

Hope this helps
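A gentler variant of the workaround above, as a sketch: send SIGTERM first and fall back to SIGKILL only if the processes are still alive after a grace period. One possible reason the workaround helps, which is an assumption and not something confirmed in this thread, is that leftover Dovecot processes outside systemd's tracking get in the way of a fresh start. The function name terminate_pids is made up for this example.

```shell
#!/bin/sh
# Sketch: stop leftover processes with SIGTERM first, SIGKILL as a last resort.

terminate_pids() {
    # $1 = seconds to wait before escalating, remaining args = PIDs
    grace=$1; shift
    if [ "$#" -eq 0 ]; then
        return 0
    fi
    kill -TERM "$@" 2>/dev/null || true
    while [ "$grace" -gt 0 ]; do
        alive=0
        for pid in "$@"; do
            if kill -0 "$pid" 2>/dev/null; then
                alive=1
            fi
        done
        if [ "$alive" -eq 0 ]; then
            return 0
        fi
        sleep 1
        grace=$((grace - 1))
    done
    # Still running after the grace period: force it.
    kill -KILL "$@" 2>/dev/null || true
    return 0
}

# Usage (as root):
#   terminate_pids 10 $(pgrep -x dovecot)
#   systemctl reset-failed dovecot.service
#   systemctl start dovecot.service
```

Here pgrep -x dovecot matches only processes named exactly dovecot; pgrep -f dovecot would be closer to what the original one-liner matches, at the risk of also matching unrelated commands that mention dovecot.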

OK, after reading some more above, I commented out all .ca lines in the Dovecot conf file and restarted Dovecot. Hopefully this helps.

Hi,
Ten days ago I confirmed that Dovecot enters the failed state even without the “ssl_ca” lines, so these lines are not causing the problem.

How can the script/task that causes the problem be run manually? I could not find a cron job for it.

I’ve been dealing with this for the last 7 months on 3 different servers: one runs Ubuntu 18 and the other two run CentOS 7. The system emails me when CPU usage is above 40% for 3 minutes, and that’s how I know Dovecot is f*cked (CPU sits at 99% while Dovecot is in the failed state). Dovecot is definitely not playing nice with Webmin renewing Let’s Encrypt certificates.

The temporary solution is to reboot. Thank god all my servers are on SSDs, so rebooting is quick, and I reboot every now and then.

The Webmin team definitely needs to look into the integration of Dovecot and Let’s Encrypt :slight_smile:
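Instead of inferring the failed state from a CPU alert, the unit state can be checked directly. A minimal sketch meant for cron, which mails any output to the job owner; the function name and the script path in the comment are hypothetical.

```shell
#!/bin/sh
# Sketch: print a message only when a unit is in the failed state.

report_if_failed() {
    # $1 = unit name, e.g. dovecot.service
    if systemctl is-failed --quiet "$1"; then
        echo "$1 is in failed state on $(uname -n)"
    fi
    return 0
}

# Example crontab entry (hypothetical script path):
#   */5 * * * * /usr/local/sbin/check-dovecot.sh
# where that script defines the function above and calls:
#   report_if_failed dovecot.service
```

This only reports the state; it does not try to restart anything.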

JamieCameron found something last year. Maybe it is connected with our problem?

Submitted by JamieCameron on Mon, 12/02/2019 - 11:25

I found a bug that can cause Dovecot to not get restarted on cert renewal - I’ll fix it in the next Virtualmin release.

https://www.virtualmin.com/comment/820295#comment-820295

Maybe he can help?

Jamie is, obviously, working on it (Ilia, too). The next version fixes several issues in Dovecot cert handling. We still do not know if any of those issues are the cause of this specific problem (the issue being dovecot in failed state, but still maybe working).

Things that have been fixed:

  1. Misuse of ssl_ca
  2. Leftover extraneous ssl_ config directives when deleting domains
  3. Some other cert related issues, where config might not match reality

The lack of restart is obviously not the cause of this problem, as there has already been a release since that change was made (so the current version of Virtualmin is restarting Dovecot on cert renewals). And, in fact, it’s likely that update is what made this problem, whatever its cause, much more apparent since it would cause it to be triggered on every cert update.

Jamie and Ilia have been doing a huge amount of work on cert handling in the mail stack over the past several weeks. I’m hopeful the next release, and one or more of the Dovecot-related fixes, will resolve this issue. There have been a bunch of confounding factors that have made it hard to push out an update to fix this one issue (especially since we still don’t know the actual cause; we’re just assuming/hoping that one of the things that has been fixed will resolve it).

What I am wondering is whether a system whose clients use IMAP is more or less likely to experience this problem than a system whose clients use POP3.

Something is niggling in my head suggesting it’s more likely with IMAP, but maybe I’m just imagining things!

No. Absolutely not.

Hello people, I see the latest update is available and SSL problems should be fixed now. Please test it and report if anything goes wrong. I will test it too.

Thank you virtualmin team!

I can confirm that some of my Let’s Encrypt certs were automatically renewed in the last few days and Dovecot is still running with Active status :star_struck: :star_struck: :star_struck:

The update fixed the issues.

Cheers :smiley:

I still have that unclosed parentheses bug in the Dovecot config after any domain’s Let’s Encrypt renewal in the latest version. :frowning:
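A crude way to spot this kind of breakage after a renewal, as a sketch. It assumes the complaint refers to unbalanced { and } block delimiters in the Dovecot config (an assumption, since the post says parentheses) and simply counts them outside comment lines. The function name brace_balance and the file path in the usage lines are hypothetical.

```shell
#!/bin/sh
# Sketch: report opening minus closing braces in a config file,
# ignoring lines that are entirely comments.

brace_balance() {
    # $1 = config file; prints 0 when braces are balanced
    awk '
        /^[[:space:]]*#/ { next }
        { opens += gsub(/[{]/, "{"); closes += gsub(/[}]/, "}") }
        END { print opens - closes }
    ' "$1"
}

# Usage:
#   brace_balance /etc/dovecot/dovecot.conf
#   doveconf -n >/dev/null    # Dovecot's own parser reports syntax errors
```

A non-zero result only says the counts differ; doveconf is the better judge of whether the file actually parses.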