After running it, the core file was never created. (I did try other directories; since this is a production build of Nginx, there’s a chance it was not compiled with debug support.)
I created a text file with all the debug messages from the error.log (2.5 MB), but I cannot see a way of attaching it. Not that the contents looked particularly useful: I searched for connection / error / socket, and none of them is mentioned even once in the entire log file.
Thank you for your help trying to find out what’s wrong. This is painful for me as I’m in an area of the system I know nothing about.
The problem is not “why is it not starting?” but more “what is preventing it running?”, isn’t it? Or, more precisely, what is stopping it prior to a restart?
If it starts using the Virtualmin GUI, that is going to reveal the same information as a systemctl restart nginx: the nginx log will only show the successful start, not what killed it.
Nginx is a pretty robust web server; it doesn’t die due to a mundane process like log rotation. It needs something more critical to stop it (a reboot?). Shouldn’t we be focussing on what stopped it in the first place?
@Joe @Stegan I don’t know if this will help or not, but I tried a small experiment. I created a new logrotate file and added each log file to it one by one.
It worked correctly until I hit the 6th log file to rotate, then Nginx died. FYI, on my server there are 20 log files to rotate in the /var/log/VirtualMin directory.
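For anyone wanting to repeat that one-by-one experiment in a scripted way, here is a minimal Python sketch. It is a variation rather than the exact procedure used above: it assumes one small logrotate config file per log (those paths are hypothetical), forces a rotation for each in turn with `logrotate --force`, and asks systemd whether nginx is still active after every step. The function names are made up for illustration, and it needs to run as root.

```python
import subprocess
from typing import Callable, Iterable, Optional


def rotate_with_logrotate(conf_path: str) -> None:
    """Force one rotation using a single logrotate config file (needs root)."""
    subprocess.run(["logrotate", "--force", conf_path], check=False)


def nginx_is_active() -> bool:
    """Return True if systemd reports the nginx unit as active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", "nginx"])
    return result.returncode == 0


def first_fatal_rotation(
    conf_paths: Iterable[str],
    rotate: Callable[[str], None] = rotate_with_logrotate,
    is_alive: Callable[[], bool] = nginx_is_active,
) -> Optional[int]:
    """Rotate the configs one at a time.

    Returns the 1-based position of the first rotation after which the
    service is no longer active, or None if it survived all of them.
    """
    for position, conf in enumerate(conf_paths, start=1):
        rotate(conf)
        if not is_alive():
            return position
    return None
```

Passing `rotate` and `is_alive` as parameters keeps the loop testable without touching a live server; on a real box the defaults would be used, for example `first_fatal_rotation(["/tmp/rotate-test/site1.conf", "/tmp/rotate-test/site2.conf"])` with whatever hypothetical config paths were prepared.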
Just in case, here is what systemctl status nginx.service had to say:
× nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; preset: enabled)
Active: failed (Result: start-limit-hit) since Sun 2023-09-17 09:44:33 MDT; 15s ago
Duration: 6ms
Docs: man:nginx(8)
Process: 1411183 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 1411184 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 1411213 ExecStop=/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid (code=exited, status=0/SUCCESS)
Main PID: 1411185 (code=exited, status=0/SUCCESS)
CPU: 130ms
Given that this happens in the wee dark hours, is there a reboot being triggered by something?
To rule the log rotate process in or out (I still can’t believe that trivial process is killing nginx), and since we have presumably established that the event is reproducible, can’t we just stop all log rotation and then start the rotations one by one at a more convenient time when they can be monitored? Assuming that only the default log rotations are active, why are we (nginx users) seeing this?
I’ve located these references. Seems logrotate reloads apache2, and presumably nginx, for some reason. Maybe to reconstruct an empty logfile? Maybe a restart would work? Kind of a kludge and just a guess.
@Stegan @Joe … I read Id1ot’s post about adding a sleep on the reload.
I first added sleep 5 and ran the file rotation on just the Virtualmin logs; it went through and Nginx was still running at the end. I then tried sleep 1 (Nginx dead at the end) and sleep 2 (Nginx alive at the end).
Playing safe, I’ve set it to 4. That said, if you need me to do any more tests for you, it’s not a problem.
Once again, thank you all for thinking about this issue; sorry I was not more help in debugging. Nigel.
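One possible reading of these results, offered as a hypothesis rather than a confirmed diagnosis: the status output shows `Result: start-limit-hit`, which is what systemd reports when a unit is started too often in a short period. If every rotated log triggers its own nginx restart, and if this server uses systemd's stock limits of 5 starts per 10 seconds (the thread does not confirm either point), then a failure on the 6th log file and the sleep 1 / sleep 2 difference would both follow. The Python sketch below models that start counter in simplified form; the function name and the 0.1 s allowance per restart are assumptions for illustration, and real systemd accounting may differ in detail.

```python
from typing import Optional


def first_refused_start(
    n_restarts: int,
    gap_s: float,
    burst: int = 5,           # assumed systemd default (StartLimitBurst)
    interval_s: float = 10.0  # assumed systemd default (StartLimitIntervalSec)
) -> Optional[int]:
    """Simplified model of a start-rate limit.

    Restarts happen every gap_s seconds. A counting window opens at the
    first start and is reset once a start arrives more than interval_s
    after the window opened. Returns the 1-based number of the first
    refused start, or None if every start is allowed.
    """
    window_start = None
    count = 0
    now = 0.0
    for attempt in range(1, n_restarts + 1):
        if window_start is None or now - window_start > interval_s:
            window_start = now   # open a fresh window
            count = 1
        elif count < burst:
            count += 1           # still within the allowed burst
        else:
            return attempt       # limit hit: this start is refused
        now += gap_s
    return None


# 20 log files; each gap is the sleep plus an assumed 0.1 s per restart.
print(first_refused_start(20, 0.0))  # no sleep  -> 6
print(first_refused_start(20, 1.1))  # sleep 1   -> 6
print(first_refused_start(20, 2.1))  # sleep 2   -> None
print(first_refused_start(20, 5.1))  # sleep 5   -> None
```

If this reading is right, the sleep only works by spreading the restarts out, so the underlying question of why each log triggers a full restart would still be worth chasing.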
Wow! That is a surprise. I am also amazed that this has not shown itself on other systems. The fact that it is restarting nginx is good, but why is it stopping it in the first place? I now wonder if the timing issue may be more related to available cores/memory, with the log rotation taking longer to complete than normal.
Are we saying that the logging process actually stops nginx (and therefore has to try to restart it),
or that the logging process (including the log rotation) takes so long that it needs to suspend nginx and therefore restart it? Are the logs that big?
5 seconds is a long time for a busy site (just think of all the frustrated users and lost orders).