Let’s Encrypt Renewal Failed

SYSTEM INFORMATION
OS type and version: Debian 12.9
Webmin version: 2.202
Virtualmin version: 7.30.4
Webserver version: Nginx/1.22.1
Related packages: certbot 3.1.0

After some back-and-forth with Let’s Encrypt and some late-night realizations, I’ve concluded that after upgrading certbot via snap on my system, whatever process owns and/or calls certbot for renewals is doing something wrong.

I’ve verified that my regular unprivileged user and the user who owns this website can call certbot without issue. Still, I suspect that Virtualmin doesn’t like calling snap-packaged versions of applications.

TL;DR
You broke your system.

Long version
It seems that by upgrading Certbot via Snapd, you’ve replaced the version shipped with your distribution, which likely broke the integration Virtualmin relies on.

Since Snapd-managed applications often differ in how they’re invoked or interact with the system, Virtualmin might not fully support them.

To resolve this, you’ll likely need to manually review your configuration files and adjust them to ensure compatibility with the Snapd version of Certbot. Alternatively, reverting to the distribution’s packaged version might simplify things…

Godspeed!

It isn’t accurate! Virtualmin doesn’t care! The snapd version of Certbot works perfectly with Virtualmin, at least it did with Ubuntu 24.04.

Okay, then I stand corrected. I once blew my 22.04 with a snapd-updated certbot, hence assumed it was the same here. @xaero, ignore my initial response.

What I said assumes that the system has either version of certbot, but not both.

@Steini no worries, we all make mistakes.

@Ilia do I need to apt remove the old certbot? I renamed the binary initially, so there were no immediate conflicts.

I’d suggest removing the snapd version of certbot if the apt version is available.

It is available in an older version (2.1.0 vs. the snap version, which is 3.1.0). They were the ones who suggested I go this route in this specific post.

I took a quick look, and the whole discussion seems off track.

I ran a test on my Debian 12, and I didn’t find any issues with certbot installed from both APT and snap coexisting. The most likely reason Webmin isn’t using it is that /snap/bin isn’t in the PATH. You can set it manually on the “Webmin ⇾ Webmin Configuration: Operating System and Environment” page.
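
A quick way to check the PATH theory from a root shell is sketched below. The helper name `path_has_dir` is made up for illustration. The check looks at the shell's own PATH, which may differ from the search path Webmin uses for the commands it runs, so treat the result as a hint only.

```shell
# Hypothetical helper: succeed if directory $1 is listed in PATH-style string $2.
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the current shell's PATH for the snap binary directory.
if path_has_dir /snap/bin "$PATH"; then
    echo "/snap/bin is in PATH"
else
    echo "/snap/bin is missing from PATH"
fi

# Show which certbot (if any) this shell would run.
command -v certbot || echo "no certbot found in PATH"
```

If the directory is missing here as well, adding it on the Webmin page mentioned above is the change being suggested.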

However, I’d like to point out again that I don’t believe you really need the snap version of certbot.

OK, I removed the snap version of certbot and reinstalled Debian 12.9’s version.

root@vulture:~# certbot --version
certbot 2.1.
root@vulture:~#

When I try to request a renewal by going to Manage Virtual Server -> Setup SSL Certificate -> SSL Providers and then clicking Only Update Renewal, the page refreshes, goes back to Current Certificate, and still tells me the certificate is expired.

Now, OTOH, if I click Request Certificate from the SSL Providers page, it attempts to get a cert but fails with the error message below.

Checking hostnames for resolvability …
… all hostnames can be resolved

Requesting a certificate for grunk.xyz, www.grunk.xyz, admin.grunk.xyz, xaerolimit.net, www.xaerolimit.net from Let’s Encrypt …
… request failed : Web-based validation failed :

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Renewing an existing certificate for grunk.xyz and 4 more domains

Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
  Domain: admin.grunk.xyz
  Type:   connection
  Detail: 2001:19f0:c:d51:5400:4ff:fe7c:fb7d: Fetching https://grunk.xyz:10000/.well-known/acme-challenge/IvaVhWIYu-LLWbsPHpnvChzKfjNj2hcdWMcqAIZNYf4: Invalid port in redirect target. Only ports 80 and 443 are supported, not 10000

Hint: The Certificate Authority failed to download the temporary challenge files created by Certbot. Ensure that the listed domains serve their content from the provided --webroot-path/-w and that files created there can be downloaded from the internet.

Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

DNS-based validation failed :

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Renewing an existing certificate for grunk.xyz and 4 more domains
Hook '--manual-auth-hook' for admin.grunk.xyz reported error code 255
Hook '--manual-auth-hook' for admin.grunk.xyz ran with error output:
 Failed to update DNS records : 
 An error occurred (InvalidChangeBatch) when calling the ChangeResourceRecordSets operation: [The request contains an invalid set of changes for a resource record set 'CAA grunk.xyz.', The request contains an invalid set of changes for a resource record set 'MX grunk.xyz.', The request contains an invalid set of changes for a resource record set 'TXT grunk.xyz.', The request contains an invalid set of changes for a resource record set 'A ns.grunk.xyz.', The request contains an invalid set of changes for a resource record set 'AAAA ns.grunk.xyz.']

Certbot failed to authenticate some domains (authenticator: manual). The Certificate Authority reported these problems:
  Domain: admin.grunk.xyz
  Type:   unauthorized
  Detail: No TXT record found at _acme-challenge.admin.grunk.xyz

Hint: The Certificate Authority failed to verify the DNS TXT records created by the --manual-auth-hook. Ensure that this hook is functioning correctly and that it waits a sufficient duration of time for DNS propagation. Refer to "certbot --help manual" and the Certbot User Guide.

Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

The folks over at Let’s Encrypt agreed that it’s failing because the verification request is being sent to a port that can’t be used (as they stated, verification can only happen via port 80 and/or 443).

So this is where I am at and what spawned this.
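
The port rule from the error above can be turned into a small check. This is only a sketch: the function names `challenge_verdict` and `check_challenge_url` and their output strings are invented for illustration, and the only rule encoded is the one quoted in the error message (redirect targets must use port 80 or 443). It assumes the redirect target uses a hostname rather than an IP literal.

```shell
# Hypothetical helper: given an HTTP status code ($1) and a redirect target ($2),
# say whether a challenge fetch would be acceptable under the
# "only ports 80 and 443" rule from the error above.
challenge_verdict() {
    code=$1
    target=$2
    case "$code" in
        200)
            echo "ok: served directly" ;;
        301|302|303|307|308)
            # Reduce scheme://host[:port]/path to host[:port].
            hostport=${target#*://}
            hostport=${hostport%%/*}
            case "$hostport" in
                *:80|*:443) echo "ok: redirect to allowed port" ;;
                *:*)        echo "bad: redirect to port ${hostport##*:}" ;;
                *)          echo "ok: redirect to default port" ;;
            esac ;;
        *)
            echo "bad: HTTP $code" ;;
    esac
}

# Fetch a URL without following redirects and judge the response.
# Needs curl and network access, so it is defined here but not run.
check_challenge_url() {
    set -- $(curl -s -o /dev/null -w '%{http_code} %{redirect_url}' "$1")
    challenge_verdict "$1" "$2"
}
```

For the redirect reported in the log, `challenge_verdict 302 "https://grunk.xyz:10000/.well-known/acme-challenge/x"` prints `bad: redirect to port 10000`.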

Something is redirecting requests away from admin.xaerolimit.net, and it shouldn’t happen. In Virtualmin, Apache is explicitly configured to avoid redirecting requests starting with .well-known by adding a rule like this:

RewriteCond %{HTTP_HOST} =admin.xaerolimit.net
RewriteRule ^/(?!.well-known)(.*)$ https://xaerolimit.net:10000/ [R]

Double-check to ensure you didn’t accidentally delete that rule.

An .htaccess file could also have directives that are sucking up those requests.

There are many ways to break access to .well-known.

@xaero you can test whether it will work just by putting a file at /home/domain/public_html/.well-known/whatever.html and browsing to .well-known/whatever.html. You should test that way before trying to renew via Let’s Encrypt again, since repeated failures will get you blocked for a time.
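
The test suggested above can be scripted roughly as follows. `make_probe` is a hypothetical helper, and the document root and hostname in the usage comment are the ones that appear in this thread; adjust them for other servers. The curl step needs network access, so it is left as a comment.

```shell
# Hypothetical helper: create a small test file under <docroot>/.well-known/
# and print its path. $1 is the document root.
make_probe() {
    dir="$1/.well-known"
    mkdir -p "$dir" || return 1
    echo "probe" > "$dir/whatever.html" || return 1
    echo "$dir/whatever.html"
}

# Example usage (run as a user who can write to the document root):
#   make_probe /home/grunk.xyz/public_html
#   curl -si http://admin.grunk.xyz/.well-known/whatever.html | head -n 5
# A 200 status with the body "probe" means the path is reachable; a 3xx
# status with a Location header means something is still redirecting it.
```

Testing each hostname on the certificate separately is worthwhile, since a redirect may apply to only one of them.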


Host is running nginx, and none of the host dirs have a .well-known or .htaccess. The redirects are happening somewhere else. I know Virtualmin created admin.grunk.xyz when the root domain was added, and I’m almost positive that it’s Virtualmin handling that redirect because http://admin.grunk.xyz is automagically redirected to https://admin.grunk.xyz:10000. But I can’t figure out where that’s happening.

.well-known is created automatically. Just make that directory and put something in it, so you can test.

And, since it’s nginx, the redirect rules look different, but they should exclude .well-known.

So, check the nginx config for this domain. Show us what you’ve got.

OK, here we go

root@vulture:~# cat /etc/nginx/sites-enabled/grunk.xyz.conf
server {
        server_name grunk.xyz www.grunk.xyz mail.grunk.xyz webmail.grunk.xyz admin.grunk.xyz xaerolimit.net www.xaerolimit.net mail.xaerolimit.net;
        listen 66.135.10.87;
        listen [2001:19f0:c:d51:5400:4ff:fe7c:fb7d];
        root /home/grunk.xyz/public_html;
        index index.php index.htm index.html;
        access_log /var/log/virtualmin/grunk.xyz_access_log;
        error_log /var/log/virtualmin/grunk.xyz_error_log;
        fastcgi_param GATEWAY_INTERFACE CGI/1.1;
        fastcgi_param SERVER_SOFTWARE nginx;
        fastcgi_param QUERY_STRING $query_string;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_param SCRIPT_FILENAME "/home/grunk.xyz/public_html$fastcgi_script_name";
        fastcgi_param SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param REQUEST_URI $request_uri;
        fastcgi_param DOCUMENT_URI $document_uri;
        fastcgi_param DOCUMENT_ROOT /home/grunk.xyz/public_html;
        fastcgi_param SERVER_PROTOCOL $server_protocol;
        fastcgi_param REMOTE_ADDR $remote_addr;
        fastcgi_param REMOTE_PORT $remote_port;
        fastcgi_param SERVER_ADDR $server_addr;
        fastcgi_param SERVER_PORT $server_port;
        fastcgi_param SERVER_NAME $server_name;
        fastcgi_param PATH_INFO $fastcgi_path_info;
        fastcgi_param HTTPS $https;
        location ^~ /.well-known/ {
                try_files $uri /;
        }
        location ~ "\.php(/|$)" {
                try_files $uri $fastcgi_script_name =404;
                default_type application/x-httpd-php;
                fastcgi_pass unix:/run/php/1728223916222655.sock;
        }
        fastcgi_split_path_info "^(.+\.php)(/.+)$";
        if ($host = webmail.grunk.xyz) {
                rewrite "^/(.*)$" "https://grunk.xyz:20000/$1" redirect;
        }
        if ($host = admin.grunk.xyz) {
                rewrite "^/(.*)$" "https://grunk.xyz:10000/$1" redirect;
        }
        listen 66.135.10.87:443 ssl;
        listen [2001:19f0:c:d51:5400:4ff:fe7c:fb7d]:443 ssl;
        ssl_certificate /etc/ssl/virtualmin/1728223916222655/ssl.combined;
        ssl_certificate_key /etc/ssl/virtualmin/1728223916222655/ssl.key;
        rewrite /awstats/awstats.pl /cgi-bin/awstats.pl;
        client_max_body_size 5M;
}
root@vulture:~#

This is entirely what Virtualmin has done; I have not touched these configs.

Please create a file and test accessing files in .well-known, as I recommended earlier.

Oh, I did that and forgot to say so. I can see the file I created here.

Ah, OK, you’re right, the redirect is still happening for the admin subdomain even when the file exists. That shouldn’t happen and it’s confusing. There’s obviously something I don’t understand about nginx configuration (probably many things, as I don’t use nginx very often).

I’ll have to make a test system to experiment with this if nobody else can spot why this isn’t behaving the way the configuration suggests it should.

Agreed. It passes its internal config check before (re)starting, so it’s valid. I can’t figure out why it’s processing this redirect.
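
One likely explanation, offered as a hypothesis rather than a confirmed diagnosis: nginx runs rewrite-module directives (`if`, `rewrite`, `return`) that sit directly in the `server` block before it selects a `location`. That would mean the `if ($host = admin.grunk.xyz)` redirect fires before `location ^~ /.well-known/` is ever considered. The sketch below only lists candidate lines. `find_unguarded_redirects` is a made-up helper, and its grep patterns are a rough heuristic that assumes redirects are written the way they appear in the config above.

```shell
# Hypothetical helper: print "rewrite ... redirect;" lines from an nginx config
# file ($1) whose pattern does not mention .well-known.
# Heuristic only: it does not parse nginx syntax or track block nesting.
find_unguarded_redirects() {
    grep -n 'rewrite .* redirect;' "$1" | grep -v 'well-known'
}

# Example usage:
#   find_unguarded_redirects /etc/nginx/sites-enabled/grunk.xyz.conf
```

If the hypothesis holds, one option that mirrors the Apache rule quoted earlier would be to exclude the challenge path in the pattern itself, for example `rewrite "^/(?!\.well-known/)(.*)$" "https://grunk.xyz:10000/$1" redirect;`. This is untested here, so check it with `nginx -t` and a plain HTTP request to a file under .well-known before relying on it.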