Sometimes I notice that my httpd runs at a constant 100% CPU (and as a result websites are down / time out). Usually only “service httpd restart” helps.
Does anybody have an idea what causes this problem, and how to prevent it? If it's not preventable, does anybody have an out-of-the-box solution to detect this and restart httpd automatically?
I searched the web, but couldn’t find any pointers.
I am on CentOS 5.5, an OpenVZ VPS, and Virtualmin 3.83.
It sounds like a website on your box sometimes uses a lot of CPU.
You need to find out which domain on the box has the faulty script, or perhaps uses some chat script.
In any case, you can and should limit the use of available resources.
This can be done through Virtualmin, and it will prevent the server from being brought to its knees.
Administration Options - Edit Resource Limits
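For reference, Virtualmin applies these limits through the standard PAM limits mechanism. A hedged sketch of what the resulting entries in /etc/security/limits.conf look like (the username and all values here are illustrative assumptions, not anything Virtualmin is guaranteed to write verbatim):

```
# Hypothetical limits for one domain's Unix user
exampleuser  hard  cpu    10      # CPU minutes per process
exampleuser  hard  nproc  40      # max concurrent processes
exampleuser  hard  as     262144  # address space, in KB
```

These limits apply per Unix user, which is why they map cleanly onto Virtualmin domains (each domain normally has its own user).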
It could be that your provider has oversold the machine and too many OpenVZ containers are fighting for CPU cycles. Not an uncommon scenario.
A possible pointer here would be to check the Apache access logs at times when CPU load spikes. That should give an idea which web site is responsible.
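A minimal sketch of that kind of log check: tally requests per minute so a spike stands out. The snippet below builds a tiny inline sample log so it can run anywhere; in practice you would point the awk pipeline at your real access_log (the /tmp path and the sample entries are assumptions for illustration):

```shell
# Create a tiny sample access log (replace with your real access_log path)
cat > /tmp/sample_access_log <<'EOF'
1.2.3.4 - - [09/Feb/2011:22:55:01 +0100] "GET / HTTP/1.1" 200 512
1.2.3.4 - - [09/Feb/2011:22:57:12 +0100] "GET /index.php HTTP/1.1" 200 8192
5.6.7.8 - - [09/Feb/2011:22:57:45 +0100] "GET /index.php HTTP/1.1" 200 8192
EOF

# Count requests per minute: split on [ ] to isolate the timestamp,
# keep date:HH:MM, then tally and sort busiest-first.
awk -F'[][]' '{ split($2, t, ":"); print t[1] ":" t[2] ":" t[3] }' /tmp/sample_access_log \
  | sort | uniq -c | sort -rn
```

Running the same pipeline over each domain's access log (e.g. per-vhost logs) and comparing the counts at the spike time should point at the responsible site.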
A hint about the “Edit resource limits” thing: This only works if you run PHP in FCGI mode, because only then does each web site have its own PHP processes. If using Apache mod_php, all web sites share the same Apache processes, and it’s not possible to limit just one of them.
What you can do in that case though is run the site in question as FCGI and the rest as mod_php, then you can specifically limit that one site.
I don’t think it is an OpenVZ overload problem, for the following two reasons:
- httpd was at 100% for 10 hours
- a “service httpd restart” immediately resolves the issue.
A buggy script is possible. But I am wondering: if a script caused the 100% CPU utilization, wouldn’t the PHP processes show the 100% CPU problem instead (I run FCGId)? I will look more into this.
I had a closer look at the logs and saw only one suspicious event (which happened a few minutes before I received my first server down message). There was nothing unusual in the access logs.
[Wed Feb 09 22:57:50 2011] [error] server reached MaxClients setting, consider raising the MaxClients setting
Maybe this caused the CPU issue?
Current prefork settings:
Typically there are up to 5-10 concurrent visitors on the site (Joomla)
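For context (the poster's actual values are not shown above), a prefork block on a small VPS commonly looks like the following; every number here is illustrative, not taken from the poster's configuration:

```
<IfModule prefork.c>
StartServers         4
MinSpareServers      4
MaxSpareServers     10
ServerLimit         50
MaxClients          50
MaxRequestsPerChild 4000
</IfModule>
```

With only 5-10 concurrent visitors, hitting MaxClients at values like these would normally mean requests are piling up because existing workers are stuck, not because traffic is genuinely that high.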
The “MaxClients reached” message could be a result of Apache going to 100% CPU and no longer serving client requests quickly enough, so they queue up in the backlog until it is full and they have to be dropped.
It’d be interesting to see log entries from just before and after the 100% CPU started… You might use the atop package, which can display resource usage and also log it to a file for later playback, hopefully allowing you to find out at which point Apache went berserk.
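A sketch of that atop workflow (the log path and 60-second interval are assumptions; adjust to taste):

```shell
# Record a system snapshot every 60 seconds to a raw file, in the background
atop -w /var/log/atop/cpu-incident.raw 60 &

# After the next incident, replay the recording interactively;
# 't' steps forward and 'T' steps backward through the samples
atop -r /var/log/atop/cpu-incident.raw
```

The raw file grows over time, so on a small VPS it is worth rotating or truncating it once the incident has been captured.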
And yeah, since you use FCGId, if the problem were with PHP scripts, you should see the CGI processes going to 100% CPU, not Apache.
Thanks for the continued interest/support.
I checked the logs; there is absolutely nothing suspicious. Currently I am clueless. Next time the problem happens I will use strace etc. to get a better idea of what causes it.
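A minimal sketch of that strace approach, assuming root access (the output path is an assumption, and `<PID>` is a placeholder for the hot process found in the first step):

```shell
# List httpd processes sorted by CPU usage to find the hot one
ps -C httpd -o pid,pcpu,etime --sort=-pcpu | head -n 5

# Attach to the busiest process and log its system calls with timestamps;
# -f also follows any children it forks
strace -tt -f -p <PID> -o /tmp/httpd-strace.log
```

If the process is spinning in userland (a tight loop with no system calls), strace will show little or nothing; in that case a no-output strace is itself a useful clue.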
I just switched from the prefork MPM to the worker MPM. It actually seems to work very well (although it needs twice the memory with the default settings, but fortunately I have plenty). Of course I will have to wait a few days/weeks to see whether the 100% CPU issue happens again.
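For anyone following along, the stock worker MPM defaults in Apache 2.2 look roughly like this (shown as reference, not the poster's tuned values):

```
<IfModule worker.c>
StartServers          2
MaxClients          150
MinSpareThreads      25
MaxSpareThreads      75
ThreadsPerChild      25
MaxRequestsPerChild   0
</IfModule>
```

The usual caveat that worker is unsafe with non-thread-safe PHP does not apply here, since PHP runs in separate FCGId processes rather than inside the Apache threads.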
Just FYI: I haven’t experienced any CPU load issues since switching to mpm-worker. Works pretty well.