Hello!
I’m sorry to come asking for help, but I find myself at my wit’s end. Could I humbly ask for suggestions, perhaps?
the server information
I have a Debian 10 dedicated server running the free edition of Virtualmin (latest stable, as provided by the automatic updates). It sits behind Cloudflare, relies on Let’s Encrypt for its SSL part, and serves quite a few websites for friends, for myself, and more. So far so good: no issues in the last months, with websites, email, and shell all working properly.
the bug’s details
I have a brand new bug on my dedi. It has been going on for a few hours now, yet things were working just yesterday: receiving emails still works (over both POP3 and IMAP), but I cannot SEND them anymore.
My error messages are in French, so my apologies if my translations are not the exact terms you’d find in English. I hope you’ll meet me halfway and guess what they ought to have been where I get them wrong.
Thunderbird returns an error message saying that it cannot establish a secure connection with its peer because the requested domain name doesn’t match the certificate on the server, hence the configuration for mail.domain.tld must be corrected.
Then follows another dialogue, in which TB offers to add an exception to the rules, in which there’s also the button to manually request, again, the certificate.
And, something certainly odd in my eyes, when I click that button… it seems to fail to request the certificate!
The part that said “bad site” and explained that the certificate doesn’t match the site is then replaced with new text saying “no available information” and “impossible to obtain that site’s identification information”.
I’ve confirmed this with two different domains on that server, so I think all domains are affected at this point.
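To take Thunderbird out of the picture, the same mismatch could be checked from any machine with Python. This is only a rough sketch under assumptions I have not verified on my server: that sending goes through port 587 with STARTTLS, and that the certificate chain itself is valid (otherwise the handshake fails before any name can be read). `mail.domain.tld` is the same placeholder as above, and the function names are made up for illustration.

```python
import smtplib
import ssl


def name_matches(pattern, hostname):
    """Return True if a certificate name (possibly '*.example') covers hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # A wildcard stands for exactly one leftmost label.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def covers(cert_names, hostname):
    """Return True if any name in cert_names covers hostname."""
    return any(name_matches(name, hostname) for name in cert_names)


def fetch_smtp_cert_names(host, port=587, timeout=10):
    """List the DNS names in the certificate presented after STARTTLS.

    Assumption: port 587 with STARTTLS. The chain is still verified;
    only the hostname check is disabled so the names can be inspected.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.starttls(context=context)
        cert = smtp.sock.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


# Possible usage (placeholder hostname):
#   names = fetch_smtp_cert_names("mail.domain.tld")
#   print(names, covers(names, "mail.domain.tld"))
```

If the printed list does not include the name the mail client connects to, that would match what Thunderbird complains about; if the connection itself fails, that would be a different problem from a name mismatch.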
what I attempted to resolve the issue
I’ve tried these operations:
(1) service postfix restart,
(2) service dovecot restart (I know it’s a postfix thing, not dovecot, but why not at this point, right?),
(3) I successfully requested a brand new Let’s Encrypt certificate for the domains in which I confirmed the issue,
(4) I made sure that in virtualmin’s dashboard there was no status icon in the red… nothing.
(5) I found no relevant .conf file for postfix but at least I made sure dovecot’s conf file was unchanged.
(6) I also checked /var/log/err.log (nothing there) and /var/log/mail.log.
Unfortunately, mail.log isn’t understandable to me as it is: many lines are added per second (let’s blame bots trying stuff, I suppose), too many for me to keep track of. Even if I use tail -f and quickly go back to Thunderbird to hit Send and Obtain certificate, I’m not fast enough: either no trace is recorded in mail.log, or it’s lost in the rest. Searching for my own IP address in there only returns the successful dovecot logins, not postfix failures (which shouldn’t be much of a surprise, postfix comes from the server itself now that I think of it).
(7) I restarted fail2ban on the system (which purges its list of banned IPs), in case a crucial IP address had been added to the block list by accident, and ran the previous tests again.
(8) As the websites are behind Cloudflare, I triple-checked: no change was made to Cloudflare’s configuration regarding the dedi in the last weeks. Their support person, when contacted, also confirmed that no change was made on their side, and that they don’t see network problems in their own network’s monitoring.
(9) The last thing I could test myself, for lack of inspiration (I googled quite a bit, but found nothing that I think is relevant here), would be to reboot the server. I don’t think that’s quite wise, though: what if there is an actual issue that will become worse after a reboot, if some software components can’t work with others anymore, etc…
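For the mail.log noise described in step (6), a small filter might make the file searchable after the fact instead of racing tail -f. This is a minimal sketch, assuming the usual syslog-style lines in which Postfix's SMTP server tags itself as `postfix/smtpd[PID]:` or `postfix/submission/smtpd[PID]:`; the function names are made up, and the IP in the usage comment is a documentation placeholder, not my real address.

```python
import re

# Matches "postfix/smtpd[123]:" as well as "postfix/submission/smtpd[123]:".
POSTFIX_SMTPD = re.compile(r"\bpostfix/(?:[\w-]+/)?smtpd\[\d+\]:")


def filter_postfix_lines(lines, needle=None):
    """Yield only Postfix smtpd lines, optionally only those containing needle."""
    for line in lines:
        if not POSTFIX_SMTPD.search(line):
            continue  # skip dovecot, fail2ban, and everything else
        if needle is not None and needle not in line:
            continue
        yield line.rstrip("\n")


def print_matches(path="/var/log/mail.log", needle=None):
    """Print the matching lines of a log file (path is an assumption)."""
    with open(path, errors="replace") as log:
        for line in filter_postfix_lines(log, needle):
            print(line)


# Possible usage, run as a user allowed to read the log:
#   print_matches(needle="203.0.113.7")   # placeholder client IP
#   print_matches(needle="TLS")           # or search for TLS-related lines
```

Filtering on the process tag first drops the dovecot login lines, so whatever remains for a given IP or keyword should come from the sending side only.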
Would you have an idea of what may cause the problem, or in what direction I could investigate?
Maybe another error log with recognizable text patterns to search for?
My apologies if that’s a newbie question… that’s what I am
Thank you very much if you can help, and merry holidays guys!