httpd / apache 100% cpu load

Hello,

I run both Virtualmin Pro and GPL on CentOS 5.5 on two servers. Since February I have had a problem that occurs on Sundays, but at different times of day: httpd goes up to 100% CPU load and is unable to serve any pages. I have tried to track this down, but with no success (I think).

I believe that it is caused by logrotate, but I am far from sure. The Apache logs have given me nothing to work with, other than showing when Apache stopped writing to them.

I have many generated entries like this in the logrotate.conf.

rotate 5
weekly
compress
postrotate
    /etc/rc.d/init.d/httpd restart ; sleep 5
endscript

Is that correct for the restart?

One of the servers survived this Sunday, after I forced log rotation on Thursday. So I wouldn’t be surprised if it goes down on Thursday, but I have also chowned the logs to root:apache. When I tried “service httpd restart”, Apache gave the error “Unable to open log files”, so every time I have had to kill it and then start it. This Sunday I caught the server that locked up (it did so twice) and tried to strace it.

But it just gave me this in a loop:

poll([{fd=271, events=POLLIN}], 1, 3000) = 1 ([{fd=271, revents=POLLHUP}])
read(271, "", 12840) = 0

Any tips or suggestions on what to do are very much appreciated.

If it’s related to the logrotate, you could always try changing the “restart” in logrotate.conf to use either “reload” or “graceful” – both of which should work with the log rotate, but may play a bit nicer.

I might start with “graceful”, and move to “reload” if that doesn’t work.
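Since there are many generated entries, it may help to preview the change before touching anything. The following is only a rough sketch: the function name preview_graceful is made up for illustration, and it assumes the postrotate lines look exactly like the one quoted above. It prints the edited config to stdout and leaves the file itself alone.

```shell
# Hypothetical helper: show what a logrotate config would look like with
# "httpd restart" swapped for "httpd graceful". Nothing is written to disk.
preview_graceful() {
  # "#" is used as the sed delimiter so the slashes in the path need no escaping.
  sed 's#/etc/rc.d/init.d/httpd restart#/etc/rc.d/init.d/httpd graceful#' "$1"
}
```

Something like preview_graceful /etc/logrotate.conf | diff /etc/logrotate.conf - should then show only the postrotate lines changing.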

However, I’ll offer that all the logrotate does is stop and restart Apache. While I have seen issues arise there due to the large number of stops and restarts, that generally doesn’t cause an Apache process to start running with 100% CPU.

-Eric

But I/O wait could cause that behavior. Maybe the I/O wait on your system is very bad. You could check with sysstat and sar.
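If sysstat is not installed yet, a rough figure can be taken straight from /proc/stat. This is only a sketch and not what sar does internally: the function name iowait_pct is made up here, and it assumes the usual Linux field order on the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# Hypothetical helper: given two samples of the aggregate "cpu" line from
# /proc/stat, print the percentage of CPU time spent in iowait between them.
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      # Sum user..steal (fields 2-9); later fields, if present, are ignored.
      total = 0
      for (i = 2; i <= NF && i <= 9; i++) total += $i
      t[NR] = total
      w[NR] = $6          # iowait is the fifth counter after the "cpu" label
    }
    END {
      d = t[2] - t[1]
      if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
      else print "0.0"
    }'
}
```

To use it, take the line twice a few seconds apart, for example a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat); iowait_pct "$a" "$b". With sysstat installed, sar -u reports the same kind of figure in its %iowait column.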

andreychek: I completely agree that it shouldn’t cause Apache to run at 100%. That’s why I’m so clueless. I will try your suggestions.

helpmin: Thanks! iowait has gone up during night hours to as much as 25, because of backups etc., I suppose. In the daytime it is < 1. I’ve seen some other interesting things, though: at 2:10PM user load has gone up to 10 and then back down to 2-3 every day except lockup days, when it of course stayed high. Regardless of whether it’s iowait or not, I now have a little more to work with.

I have still not gotten anywhere with this. It happens some time between 9:30AM and 10:00AM, every Sunday.
So I have attached the cron log for around this time, but to my untrained eye nothing suspicious occurs at that time.

Any suggestions on what else I can do?

I’m unfortunately not really sure what might be causing that… I don’t see anything in your cron logs that appears out of the ordinary.

It’d be interesting to see what processes were running at the point when Apache first begins using all that CPU. That may offer some insight into what’s going on.

You could set up “ps auxw” to be run every few minutes on Sunday mornings, and then later review that to see what else was running when Apache started misbehaving.
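One way to do that, as a minimal sketch: the function name snapshot_ps, the script path in the cron comment, and the default directory /var/log/ps-snapshots are all made up for illustration, so adjust them to taste.

```shell
# Hypothetical helper: append a timestamped "ps auxw" snapshot to a
# per-day log file in the given directory.
#
# Example crontab entry (every 2 minutes, 09:00-10:59, Sundays only),
# assuming this function is wrapped in a script at the path shown:
#   */2 9-10 * * 0 /usr/local/sbin/ps-snapshot.sh
snapshot_ps() {
  out_dir=${1:-/var/log/ps-snapshots}
  mkdir -p "$out_dir" || return 1
  out_file="$out_dir/ps-$(date +%Y%m%d).log"
  {
    # Header line so individual snapshots can be told apart later.
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    ps auxw
  } >> "$out_file"
}
```

Afterwards, searching the log for the "===" headers around the time Apache locked up should show what else was running.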

-Eric

Another idea to track this down might be using atop. It is an advanced version of top which also includes a service that continuously writes performance data to a file, which can be analyzed afterwards.

You could try switching Apache to the worker MPM. That solved a similar problem for me.

By the way, is your server a VPS?