I don't believe it! LE problems again

SYSTEM INFORMATION
OS type and version: Ubuntu Linux 20.04.6
Webmin version: 2.101
Usermin version: 2.001
Virtualmin version: 7.8.2
Theme version: 21.06
Package updates: All installed packages are up to date
Box name: recycling (UK)

Background:
New box with provider Linode (I have another with them).
Transferred the domain from the existing provider (I was managing it on a different box/provider at a different location, US).
Changed the domain at the registrar so the NS point to the Linode box.
Added the domain at the new provider, checking that the A/AAAA/MX records are correct.
Checked propagation (ping & DNS Checker) - all good.
Installed Virtualmin - no errors.
Added the VS domain (our-recycle.co.uk).

So I added LE SSL
and again, errors :astonished:
Plugins selected: Authenticator webroot, Installer None
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for mail.our-recycle.co.uk
http-01 challenge for our-recycle.co.uk
http-01 challenge for www.our-recycle.co.uk
Using the webroot path /home/our-recycle.co.uk/public_html for all unmatched domains.
Waiting for verification…
Challenge failed for domain mail.our-recycle.co.uk
Challenge failed for domain our-recycle.co.uk
Challenge failed for domain www.our-recycle.co.uk
http-01 challenge for mail.our-recycle.co.uk
http-01 challenge for our-recycle.co.uk
http-01 challenge for www.our-recycle.co.uk
Cleaning up challenges
Some challenges have failed.
IMPORTANT NOTES:

What have I forgotten this time?

If that’s the domain, I can’t reach it.

It’s always the same three problems:

  1. DNS. Does this name point to the right IP for your server?
  2. Virtual host. Is your server serving the right public_html directory for this hostname? (“The wrong site shows up” in our Website Troubleshooting – Virtualmin guide)
  3. Redirects/proxy rules. If you have redirects or proxy rules, you must exclude the .well-known directory. To test this, put a file in /home/domain/public_html/.well-known and try to fetch it. If you can’t fetch it, you need to fix that.

Let’s Encrypt makes a web request. If you can’t make a web request for objects in .well-known, neither can LE. So… fix that, just like you would any other website problem: check DNS, try it in a browser, look in the error and access logs.
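
For anyone landing here later, a minimal sketch of the .well-known test from point 3, assuming curl is installed and the webroot follows the /home/domain/public_html layout shown in the log. The names `challenge_url` and `check_well_known` are hypothetical helpers made up for illustration; they are not part of Virtualmin or certbot.

```shell
#!/bin/sh
# Hypothetical helper: build the plain-HTTP URL for a file under
# .well-known/acme-challenge on the given hostname.
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Hypothetical helper: drop a test file into the webroot, fetch it over
# HTTP, then clean up. Succeeds only if the fetched body matches.
check_well_known() {
    domain=$1
    webroot=$2
    token="le-test-$$"
    dir="$webroot/.well-known/acme-challenge"
    mkdir -p "$dir" || return 1
    printf 'ok\n' > "$dir/$token" || return 1
    # -L follows redirects, as a browser would
    body=$(curl -sS -L --max-time 10 "$(challenge_url "$domain" "$token")")
    status=$?
    rm -f "$dir/$token"
    if [ "$status" -eq 0 ] && [ "$body" = "ok" ]; then
        echo "fetch OK for $domain"
    else
        echo "fetch FAILED for $domain (curl exit $status)"
        return 1
    fi
}

# Example, run on the server itself:
# check_well_known our-recycle.co.uk /home/our-recycle.co.uk/public_html
```

If the fetch fails with "connection refused", the web server is probably not listening at all, which is a different problem from a redirect or the wrong virtual host answering.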


You are right, it always is.

  1. The DNS is managed at the provider, not under BIND. The A/AAAA records agree with the ones “suggested” by Virtualmin (except the ftp. and localhost. ones), and DNS Checker returns the correct A/AAAA/NS/MX.

  2. and 3. This is a new (LEMP) install on a new, clean box. I have done nothing in Virtualmin yet other than log in (using IP:10000), Create Virtual Server, then go to Server Configuration -> SSL Certificate -> Lets Encrypt -> Automatically Renew (Yes).

I have done this so many times now on other boxes (and on the previous USA-based box) that it is becoming routine.

Hmmm, me neither! But I can access and log in to Virtualmin with the IP, which tells me that at least the IP is reaching the box. I will dig deeper.

Well that was a little adventure.

The webserver (nginx) was down and refused to restart from the dashboard.

Running nginx -t gave me this:

A .lock file - now where did that come from? I have not edited anything. Did this get left behind when the VS was added?

Anyway, deleting the domain.conf.lock file and rerunning nginx -t succeeded, and nginx restarted. Requesting the LE cert then of course worked.

I do remain puzzled as to why/how/when a .lock file was created and not removed, but also why nginx even thought it was an enabled site (it might have been in the sites-enabled directory, but it doesn’t end with .conf).
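
On the "why did nginx read it" part: on a stock Ubuntu nginx package, nginx.conf usually contains `include /etc/nginx/sites-enabled/*;`, which loads every file in that directory whatever its extension. That would explain a .conf.lock file being parsed, but it is worth confirming on your own box. A small sketch for spotting leftovers before restarting, assuming the config lives under /etc/nginx; `list_lock_files` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list leftover *.lock files under an nginx config
# directory (defaults to /etc/nginx, which is an assumption).
list_lock_files() {
    find "${1:-/etc/nginx}" -name '*.lock' -print
}

# Example, run as root:
# list_lock_files /etc/nginx
# grep -n 'include' /etc/nginx/nginx.conf   # shows which globs nginx loads
# nginx -t && systemctl restart nginx       # restart only if the test passes
```

Anything it prints is a candidate for the kind of file that broke nginx -t here; check that nothing is still writing to it before deleting.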

In such situations, first quickly check that Bind9 and Apache/Nginx are running and that ports 53, 80 and 443 are listening.
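
A rough sketch of that quick check, assuming systemd and `ss` (from iproute2) are available. Port 80 is checked as well, since Let's Encrypt's http-01 validation connects on port 80 first. `port_listening` is a hypothetical helper, and the unit names `nginx` and `named` are assumptions that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -tln` output on stdin and succeed if
# something is listening on the given TCP port.
port_listening() {
    awk -v p="$1" '
        NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END { exit !found }
    '
}

# Example, run on the server:
# for svc in nginx named; do
#     printf '%s: ' "$svc"; systemctl is-active "$svc"
# done
# for port in 53 80 443; do
#     if ss -tln | port_listening "$port"; then
#         echo "port $port: listening"
#     else
#         echo "port $port: NOT listening"
#     fi
# done
```

This only looks at TCP listeners; BIND also answers on UDP port 53, which `ss -uln` would show.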

I have a VM where the beneficiary does not tolerate errors in getting a LE certificate. From time to time I have to create the certificate myself in the CLI, using the local DNS. I copy and paste the values provided by LE into the BIND configuration, restart the service, then finish the steps with LE in the CLI. The certificate is generated quickly and things move on. A cronjob would work well, but if errors occur and are not noticed immediately by a human operator, there are problems with the beneficiary. We guarantee by contract that the website will have a valid certificate 100% of the time it is up.

For those who want to get a LE certificate manually, here are the steps I am using:

  1. Open two terminal windows side by side. The left will be used for LE command, the right for editing the BIND9 configuration and restarting the service.

  2. Run the following command for LE:

certbot certonly --manual --agree-tos -m valid-email@domain.tld --preferred-challenges dns-01 -d domain1.tld -d domain2.tld --renew-by-default --rsa-key-size 4096

  3. LE will give you, step by step, a string for each requested domain. Copy and paste them, line by line, into this file: /var/lib/bind/domain.tld.host.

_acme-challenge.domain1.tld. IN TXT "STRING1_FROM_LE"
_acme-challenge.domain2.tld. IN TXT "STRING2_FROM_LE"

  4. If there are no more strings from LE, save the file and restart the BIND9 service.

  5. Finish the LE process. You should get your certificate. I point Postfix, Dovecot, Apache, … at the place where the certificates are stored. Don’t forget to restart these services after generating a new certificate.
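
One thing that can save a failed attempt with the steps above: confirm the TXT records are actually being served before telling certbot to continue. A minimal sketch, assuming `dig` is installed and BIND answers on localhost; `acme_txt_record` is a hypothetical helper that only prints the zone-file line in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print a zone-file TXT line for a DNS-01 challenge.
# $1 = domain, $2 = the string certbot displayed for that domain.
acme_txt_record() {
    printf '_acme-challenge.%s. IN TXT "%s"\n' "$1" "$2"
}

# Example: generate the line to paste into the zone file
# acme_txt_record domain1.tld STRING1_FROM_LE

# After restarting BIND, check what the local server returns before
# pressing Enter in the certbot window:
# dig +short TXT _acme-challenge.domain1.tld @localhost
```

If dig prints the same quoted string that certbot displayed, the record is live on the local server; secondaries or external resolvers may take longer to catch up.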

The Virtualmin team did a great job with getting the certificates, but there are too many reports of people complaining about issues. People should learn the manual way too.

I agree, but that is a step backwards. One of the things that brought me to Virtualmin (one of many) was the ease of managing SSL, when hosting providers charge for something like obtaining a certificate. Also, in this case BIND is not involved; DNS is managed pretty well at the box provider.

I’m still confused as to where the lock file originated.

“It is essential in multi-user systems to avoid conflicts when multiple processes try to access the same file simultaneously. In Linux, file locking is implemented through the use of locks. Lock prevents other processes from accessing a file until the lock is released”

Source: https://www.tutorialspoint.com/introduction-to-file-locking-in-linux

Sure, I guessed that, but it doesn’t answer the question.
This is a new box and a new installation (so only one user). Probably several processes run during install, one of which must open and alter the nginx configuration (certainly when adding a VS); perhaps that is where the problem lies. Perhaps another part of that addition is trying to work with the file, or creating an error in it that nginx has a problem with when restarting. Perhaps it is all the fastcgi php gunk that is not required (convenient for users/future users of php), but not something I would bother with if I was creating an nginx conf file.

In a nutshell:
Yes, it does answer your question.
The applications installed on your Linux system require users/groups, making it a “multi-user” platform. Applications will lock a file while writing to it and unlock it when finished.

It happens to all of us once in a while.

But something (some process) didn’t. If it was the process of generating the .conf file, then there is a problem with it, or the process that restarts nginx needs delaying until it has finished.
If I was generating a .conf file (as I have in the past) I would stop nginx, create the new conf file, run nginx -t to check it, then restart. (I understand that is not so easy to do when there are multiple VS running on the box - again, there is only 1 on this box.)

There could be a number of reasons why the file wasn’t removed after it finished its process.
Check your logs.

Like I said, it happens to all of us once in a while…

First time in over half a century and I thought things were getting faster not slower :slight_smile:
