Apache restarting on a Sunday morning

pixel_paul · February 8, 2010, 2:41pm

Got a strange problem here that has been happening for the past few weeks on a Sunday morning at 0400 - but not every Sunday.

What happens is that Apache shuts down and then cannot restart.

This is the message in /var/log/messages:

Feb 7 04:02:15 DS-R1062667-001 syslogd 1.4.1: restart.
Feb 7 04:02:59 DS-R1062667-001 logrotate: ALERT exited abnormally with [1]

And then httpd error log:
[Sun Feb 07 04:02:15 2010] [notice] Digest: generating secret for digest authentication ...
[Sun Feb 07 04:02:15 2010] [notice] Digest: done
PHP Warning: Module 'imap' already loaded in Unknown on line 0
[Sun Feb 07 04:02:15 2010] [notice] mod_python: Creating 4 session mutexes based on 256 max processes and 0 max threads.
[Sun Feb 07 04:02:16 2010] [notice] Apache/2.2.3 (CentOS) configured -- resuming normal operations
[Sun Feb 07 04:02:16 2010] [notice] Graceful restart requested, doing restart
[Sun Feb 07 04:02:16 2010] [error] (9)Bad file descriptor: apr_socket_accept: (client socket)
[Sun Feb 07 04:02:17 2010] [notice] Digest: generating secret for digest authentication ...
[Sun Feb 07 04:02:17 2010] [notice] Digest: done

repeated about 15 times, then this:

[Sun Feb 07 04:02:58 2010] [emerg] (28)No space left on device: mod_fcgid: Can't create global mutex
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:07:14 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:08:23 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:17:52 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed

This has me a bit lost as to where to look for the source of the problem, so any pointers would be much appreciated.

Thanks,

Paul

Eric · February 8, 2010, 2:49pm

Hi Paul,

It looks like Apache is generating space-related errors.

Are any of your drives low on space?

You can see available space by typing this:

df -h

Since Apache’s restart on Sunday happens at roughly the same time as the log rotation, it may be that whatever partition /var is on is running out of space.

-Eric

pixel_paul · February 8, 2010, 3:14pm

Hi Eric,

[root@]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       69G   13G   52G  20% /
/dev/sda3              99M   36M   58M  39% /boot
tmpfs                 506M     0  506M   0% /dev/shm

So there is definitely space left! I’ve just noticed an email from Cron Daemon:


Stopping httpd: [  OK  ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address
[::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
[FAILED]
error: error running postrotate script for /home/*****1/logs/access_log /home/******1/logs/error_log
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address
[::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
[FAILED]
error: error running postrotate script for /home/******2/domains/*****.com/logs/access_log
/home/*******2/domains/******.com/logs/error_log 
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/*****/logs/access_log /home/*****/logs/error_log
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/****/logs/access_log /home/****/logs/error_log
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/*****/logs/access_log /home/****/logs/error_log

So there is definitely something fishy going on here…

Thanks,

Paul

ronald · February 8, 2010, 3:57pm

It seems one or more of the users have run out of space and the system can’t write to the log files.

pixel_paul · February 8, 2010, 4:02pm

I don’t have any quotas enabled for any users, so not sure how they can run out of space?

ronald · February 8, 2010, 4:13pm

Looking again, I see the error is actually: Unable to open logs
and the logs are in the users’ space.

Did the user delete those logs? I had this happen once and it crashed my server too.

Eric · February 8, 2010, 4:16pm

Howdy,

In addition to what Ronald said – I’m curious what output you get from this command:

repquota -v /

pixel_paul · February 8, 2010, 4:38pm

Ronald:
The user has no access to the server - and as far as I know I haven’t deleted them.

Eric:
[root@]# /usr/sbin/repquota -v /
repquota: Mountpoint (or device) / not found.
repquota: Not all specified mountpoints are using quota.

Thanks,

Paul

ronald · February 8, 2010, 4:49pm

Thinking out loud: did you try a killall on httpd?

pixel_paul · February 8, 2010, 4:54pm

I actually restarted the machine in order to get this working again. Otherwise I hadn’t touched the server, other than creating a couple of virtual servers on Friday.

ronald · February 8, 2010, 5:25pm

Can you see if the log files for those users are linked to /var/log/virtualmin/*?
As I remember, this has changed for new users but not for existing users.
Perhaps it is relevant.

ken.wiesner · February 15, 2010, 3:15pm

It probably wasn’t a drive space issue. It was probably due to a semaphore array issue. This actually happened to my server on EC2 this morning. I found a blog that said to run:


[root@hosting01:/var/log/httpd] ipcs -s | grep apache
0x00000000 82575367   apache    600        1         
0x00000000 50528264   apache    600        1         
0x00000000 53248009   apache    600        1         
0x00000000 59244554   apache    600        1
...truncated...

I then ran the following to clear it out:


[root@hosting01:/var/log/httpd] ipcs -s | grep apache | awk '{print "ipcrm sem " $2}' | sh
resource(s) deleted
resource(s) deleted
resource(s) deleted
...truncated...

I then issued the restart for apache and all was well.
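
A more cautious variant of the cleanup above, as a sketch only: it prints the removal commands so they can be reviewed before being piped to sh. It assumes the `ipcs -s` column layout shown above (key, semid, owner, perms, nsems) and the `-s` option form of `ipcrm`; the function name `sem_cleanup_cmds` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -s` output on stdin and print one
# `ipcrm -s <semid>` command per semaphore array owned by the given user.
# Nothing is deleted here; the commands are only printed for review.
sem_cleanup_cmds() {
    owner="$1"
    # Column 3 is the owner and column 2 the semid in the layout shown above.
    # The numeric check on column 2 skips the header lines.
    awk -v owner="$owner" '$3 == owner && $2 ~ /^[0-9]+$/ { print "ipcrm -s " $2 }'
}

# Usage (assumption: Apache runs as user "apache", as in the output above):
#   ipcs -s | sem_cleanup_cmds apache        # review the list first
#   ipcs -s | sem_cleanup_cmds apache | sh   # then actually remove them
```

Only remove the semaphores while httpd is stopped, since a running Apache is still using its own.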

cruiskeen · February 21, 2010, 10:53am

I’m up at the moment nursing my server for this very reason. It happens because logrotate repeatedly stops and restarts httpd during the weekly rotation, which runs at this time. By default the logrotate setup here does a full restart of Apache. When that happens over and over in quick succession, Apache sometimes doesn’t finish dying before the next restart, and the semaphores pile up until you run out of semaphore space.

A couple of things I’m trying here: having Apache just do a graceful instead of a restart, and increasing the timeout in the httpd start script where it waits for the children to die. I may also have it clear the semaphores before starting Apache.
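
To check whether leftover semaphores really are the cause, the number of semaphore arrays in use can be compared with the kernel limit. When that limit is reached, creating a new semaphore fails with "No space left on device" even though the disks have plenty of room. A rough sketch, assuming a Linux system where the fourth field of /proc/sys/kernel/sem is the maximum number of arrays (SEMMNI); the function name `count_sem_arrays` is made up.

```shell
#!/bin/sh
# Hypothetical helper: count the semaphore arrays listed in `ipcs -s`
# output read from stdin. Data rows are recognised by a key starting "0x".
count_sem_arrays() {
    awk '$1 ~ /^0x/ { n++ } END { print n + 0 }'
}

# Usage on a live Linux system:
#   used=$(ipcs -s | count_sem_arrays)
#   max=$(awk '{ print $4 }' /proc/sys/kernel/sem)   # SEMMNI: max number of arrays
#   echo "semaphore arrays in use: $used of $max"
```

If the count sits at or near the limit after a failed rotation, that points at the pile-up described above rather than at disk space.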

pixel_paul · March 8, 2010, 12:33pm

Thanks everyone for the ideas as to the cause. I finally had a chance to look at this a bit closer and it seems to be related to the semaphore issue.

cruiskeen: did you find that the changes you made (graceful instead of restart, or the timeout increase) have made a difference?

cruiskeen · March 8, 2010, 3:23pm

Hard to say since it only happened every once in a while in the first place. But it hasn’t happened since I made the changes, for whatever that’s worth.

pixel_paul · March 8, 2010, 4:19pm

I’ll give it a go and see what I find.

Thanks for the heads up on this.

Cheers,

Paul

uwe · July 18, 2010, 9:46pm

Hi,

I am having the same issue, but only on 2 of my 5 Virtualmin servers (one is GPL; the others, my own or customers’ servers, have commercial licenses). These two (commercial) die at random, always on a Sunday morning around 4:00 am, logrotate time. I have all updates in place on all of the servers; the only difference from a stock Virtualmin is php 5.11 (bleeding edge repo).
I came across this thread while looking for a solution, as it is just inconvenient to get up at 4:00 in the morning to check whether the servers are running.
Is there a fix available by now?

Uwe

pixel_paul · July 27, 2010, 3:50pm

I managed to stop this from occurring by changing all post-rotation commands to
/etc/rc.d/init.d/httpd graceful

Since then I haven’t encountered this issue.
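
For anyone with many virtual servers, the same change can be previewed with sed before editing anything. This is a sketch only: it assumes the post-rotation command appears literally in a logrotate configuration file, the path in the usage line is a guess that may differ on your system, and the function name `restart_to_graceful` is made up.

```shell
#!/bin/sh
# Hypothetical helper: print a logrotate config with the hard httpd restart
# replaced by a graceful one. The input file itself is not modified.
restart_to_graceful() {
    sed -e 's#/etc/rc\.d/init\.d/httpd restart#/etc/rc.d/init.d/httpd graceful#' "$1"
}

# Usage (the path is an assumption; check where your rotation rules live):
#   restart_to_graceful /etc/logrotate.conf | diff /etc/logrotate.conf -
```

The diff shows exactly which post-rotation lines would change, so the edit can be checked before it is applied by hand or through the Virtualmin interface.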

uwe · July 27, 2010, 11:04pm

Hi pixel_paul,

thanks for your reply.
I went and checked the settings straight away and, what do you know, there are different types of automatically generated entries:

/usr/sbin/apachectl graceful or restart

/etc/rc.d/init.d/httpd restart

The first ones were mainly set to “graceful”, the second ones all to restart. I changed every restart to graceful and hope for the best.
On my Virtualmin GPL server they were all set to graceful, which seems to be the reason why that one did not fail.

The question now is, why different command lines? The servers created lately were all in the second form. Is there a difference between these command lines? Which one should I use? And where can I change or force it for new servers? I checked the template, but there are no settings for the post-rotation command.

Thanks

Uwe

pixel_paul · July 28, 2010, 8:26am

Hi Uwe,

All I know is that the hard restart with /etc/rc.d/init.d/httpd restart was causing the problem, and changing to graceful solved the problem.

When I create a new virtual server the logrotate post command is /etc/rc.d/init.d/httpd graceful - if you’re getting it set to /etc/rc.d/init.d/httpd restart then this may need to be escalated to someone who knows a bit more…

Cheers,

Paul