mod_fcgid: read data timeout in 31 seconds

austinfwd · October 18, 2011, 8:39pm

I have a fairly large SQL table, which one of our PHP apps does various queries against. We are having a 31-second timeout occurring which then produces an error message for the end-user & the page w/ the data never displays. The apache error log for the specific virtualmin website states:

[warn] mod_fcgid: read data timeout in 31 seconds

I’ve researched the issue a little bit & some people suggested modifying the following apache conf parameters:

IPCConnectTimeout
IPCCommTimeout

I found that on my debian 5 Virtualmin GPL system, I have the following config file:
/etc/apache2/mods-enabled/fcgid.conf

It looked like this:

AddHandler fcgid-script .fcgi
IPCConnectTimeout 40

I changed it to:

AddHandler fcgid-script .fcgi
IPCConnectTimeout 60
IPCCommTimeout 300

I restarted apache, but the errors persist. I also changed the following PHP-related timeout settings:
max_execution_time
max_input_time
default_socket_timeout
mysql.connect_timeout

I set each of these to either 60 or 120 seconds.

I changed these parameters in pretty much every single php.ini file I could find that I thought could possibly come into play (every PHP conf file under /etc/php5, the one in /home//etc/php5, etc).

Restarted apache. Still the error is the same:
mod_fcgid: read data timeout in 31 seconds

Watching the clock confirms that this error is being produced right at 31 seconds.
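
Since the logged value stays at 31 seconds no matter what the global fcgid.conf says, one thing worth checking is whether another file under the Apache config tree sets the same directives. A per-site virtual host file, for example, would normally take precedence for that site. A minimal sketch, assuming the old-style directive names used above; find_fcgid_timeouts is a made-up helper name, not part of any package:

```shell
# Hypothetical helper: list every line under a config directory that sets
# one of the mod_fcgid timeout directives mentioned in this thread.
# Output is grep's usual file:line:content format.
find_fcgid_timeouts() {
    conf_dir=${1:-/etc/apache2}   # default is the Debian path from the post
    grep -rniE '^[[:space:]]*(IPCCommTimeout|IPCConnectTimeout|BusyTimeout|IdleTimeout)[[:space:]]' "$conf_dir"
}
```

Run it as find_fcgid_timeouts /etc/apache2. If a line with a value of 31 turns up, that file is the one to change.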

I know some people have suggested that this may be a PHP issue and NOT a fcgi issue. But it seems to me that if that were the case, we’d be seeing a PHP error message and not an fcgi message? Or at least an error message that seemed more descriptive of a PHP timeout parameter being exceeded?

Some people suggested changing the virtualmin website’s PHP execution mode to CGI. I did so, but doing so created strange results. I wound up with no error message, but also it seemed that the script just never stopped executing.

So I’m looking for help with this issue.

Just to give more insight into what our PHP app does, the function we’re trying to use is actually just to dump one of our mysql tables via PHP. What we’re finding is that our PHP succeeds if we limit the query with a WHERE clause to only return a subset of the records. But removing the WHERE clause results in the script hitting 100% CPU usage for 1 core & staying that way until the 31-second fcgi timeout error occurs.

When we query the subset, we also see the 1 CPU core pegged out, but only for 10-15 seconds, after which the expected results are returned.

The primary solution I’d hope to see is simply to increase this timeout value to 60 or 120 seconds.

Anybody know how I can accomplish this?

Thanks a lot!

Doug

Eric · October 18, 2011, 8:45pm

Howdy,

Well, you can certainly tweak various timeouts.

The easiest way to do that is in Server Configuration -> Website Options, and set “Maximum PHP script run time”.

Setting that will change both the FCGID timeouts as well as the timeouts in the php.ini file.

However, the fact that switching to CGI has your script running indefinitely suggests to me that the script or database may be at fault – I don’t think CGI itself is the problem, I think CGI is simply allowing the script to keep on going.

How large is this particular MySQL table? That is, how many records are in it? If it’s a particularly large table, that could cause the problem you’re seeing.

Also, do you see similar results when using mysqldump to dump the table? Does it take just as long and similar amounts of CPU?

-Eric

austinfwd · October 18, 2011, 9:04pm

I hesitated to even make mention of the behavior I saw w/ CGI mode, as I wanted to first make sure that we covered the possibility that fcgi is the primary cause. However, you make some good points, and I should check those out. You pretty much gave me insight for both potential causes, so thanks for that.

I’m really not sure on dumping the table via a different means. I’ll test & see what happens.

The table has 28 columns, the majority of which are some length varchar.

It has 11840 rows.

austinfwd · October 18, 2011, 9:11pm

For kicks, I went to the table in Webmin > Servers > MySQL Database Server. I exported all rows & columns & specified it to display the exported data as CSV in the web browser. It produced these results in probably 20-30 seconds, with a much lower CPU load than what I see via the PHP app we’re using.

Just for the sake of sanity, I selected all of the results & pasted in MS Excel & there are indeed 11840 rows.

So I’m thinking the database & mysql are fine…?

Out of curiosity I checked the mysql logs in /var/log & they had pretty much no data. I’m not sure whether they should. But it seems like when I’ve looked at those logs in the past on other systems, they were pretty minimal (or blank).

I’ll test changing the fcgi timeout using the method you suggested & will report my results.

Eric · October 18, 2011, 9:14pm

I think what you’re seeing is a script that’s running too long – so you’re bumping up against the limit in FCGID – but CGI doesn’t have any hard limits the way FCGID does.

As a test, you could always use mysqldump… something like this would dump that one table, you could run this from the command line:

mysqldump --user=DB_USERNAME --password=DB_PASS DATABASE_NAME TABLE_NAME > my_table.sql
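
To put a number on that test, the dump can be wrapped in a small timer. This is only a sketch: elapsed_seconds is a made-up helper, and it relies on date +%s, which GNU date supports.

```shell
# Hypothetical helper: run a command and print how many whole seconds
# of wall-clock time it took. The command's stdout is discarded so that
# only the timing is printed.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Example (same placeholders as the command above):
# elapsed_seconds mysqldump --user=DB_USERNAME --password=DB_PASS DATABASE_NAME TABLE_NAME
```

The resolution is whole seconds, so a very fast dump simply reports 0.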

austinfwd · October 18, 2011, 9:24pm

So your suggestion to change the timeout settings in virtualmin via Server Configuration > Website Options did give me what I was initially asking for. It allowed me to increase the timeout. However, doing so seems to just be delaying the inevitable. I set it to 60 seconds, and then I didn’t get the error until 60 seconds. I then set it to 120, and as a result it happened after 120 seconds.

So you are probably right on normal CGI just letting it run forever. I’ll try to dump the table from the command-line, and I’ll also try to change the PHP script execution mode to each of the others just to test.

I’ll report back my findings.

Just as a side note, I sort of felt like an idiot when I saw that the timeout setting I was looking for was at Server Configuration > Website Options, as that’s where I went last night to change the execution mode from fcgi to CGI. Somehow I totally missed the 30-second timeout setting there… Oh well.

austinfwd · October 18, 2011, 9:30pm

Wow that was amazingly fast, and is a good testament to how much latency is introduced by apache & php doing all of their fancy magic & processing (and rendering by the web browser). Dumping the table via the command-line as you suggested literally took less than 1 second. And I verified that it did indeed export the entire table.

By the way, the exported table is only 3.1MB on the filesystem.

I’ll now try the other PHP execution modes & see what happens.

austinfwd · October 18, 2011, 9:40pm

I set the PHP execution mode to apache mod_php (run as apache’s user). Then I tried our PHP app again to dump the table. At first I thought I was back in the “runs forever, but never produces results” boat. But after probably 3-5 minutes, it actually completed & the web page produced the report it was supposed to. So do you think that I just have a really slow server or some inefficient PHP code?

Seems strange that:

dumping the table from the command line takes 1 second.
Dumping as CSV to client browser in Webmin > Servers > MySQL module takes 20-30 seconds

but my PHP app which simply dumps the table as a CSV & then redirects to a download page to enable the end-user to download the CSV file takes 3-5 minutes??

What do you think?
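
For one more data point outside PHP, the same table could be pulled as CSV with the stock mysql client. A rough sketch, assuming the client's --batch mode (tab-separated output); tsv_to_csv is a hypothetical helper written for this example:

```shell
# Hypothetical helper: convert tab-separated lines on stdin to CSV on stdout.
# Every field is wrapped in double quotes, and embedded double quotes are doubled.
tsv_to_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            gsub(/"/, "\"\"", $i)
            $i = "\"" $i "\""
        }
        print
    }'
}

# Example (placeholders as in the mysqldump command earlier in the thread):
# mysql --batch --user=DB_USERNAME --password=DB_PASS \
#       --execute='SELECT * FROM TABLE_NAME' DATABASE_NAME | tsv_to_csv > my_table.csv
```

Note that in batch mode the client escapes tabs and newlines inside values (as \t and \n), and this helper leaves those escapes as they are. If this finishes in seconds, that points further toward the PHP code rather than the database.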

austinfwd · October 18, 2011, 9:48pm

I set the PHP script execution back to FCGI, and the timeout to 300 seconds (5 minutes). The script completed in just over 4.5 minutes.

So I guess the problem is solved by changing the FCGI timeout setting as I was initially hoping. However, you have helped me to see that the delay probably isn’t caused by the mysql server, as it is able to dump that table quickly, and webmin mysql module is able to query & produce a CSV quickly. Seems to me that it must be the PHP code causing the slowness.

Thoughts??

And thanks a MILLION for your help!!

Doug

Eric · October 18, 2011, 10:12pm

Well, the delay you’re seeing seems surprising… the difference between Apache+PHP and mysqldump shouldn’t be minutes.

So my guess is that something else is going on to cause that discrepancy – mysqldump may be faster, but I’d be surprised if that difference had to be “minutes”, and not just seconds.

But, that’s just a hunch

-Eric

austinfwd · October 18, 2011, 10:14pm

Yeah. I’m going to talk to the developer of the PHP app we’re using & refer him to this thread.

austinfwd · October 18, 2011, 10:21pm

OK. I think I need some more help. I really don’t know why this is happening, but I’m now getting the original error again:
Web browser produces: Internal Server Error
And the apache error log produces: [warn] mod_fcgid: read data timeout in 31 seconds

Seems like the timeout increase was short lived…? I checked in virtualmin at Server Configuration > Website Options, and it was still set to execute as the website owner via fcgi with a timeout of 300 seconds. I wondered whether some other setting elsewhere could have overridden it, and perhaps someone else restarted apache (or otherwise reloaded apache’s config), or maybe a cron job did (we do have scheduled virtualmin backups firing off throughout the day). So I changed the setting from 300 to 301 seconds. I then tried to run the PHP script again, but I’m still getting the timeout after 31 seconds.
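
To see which timeout values Apache has actually been enforcing, the warnings in the site's error log can be tallied. A small sketch; summarize_fcgid_timeouts is a hypothetical helper, and the log path is whatever the site's error log happens to be:

```shell
# Hypothetical helper: count how often each timeout value appears in
# mod_fcgid "read data timeout" warnings, most frequent first.
summarize_fcgid_timeouts() {
    grep -o 'mod_fcgid: read data timeout in [0-9]* seconds' "$1" | sort | uniq -c | sort -rn
}

# Example: summarize_fcgid_timeouts /path/to/site/error_log
```

If the only value logged after the change is still 31, the new setting is evidently not being applied to those requests.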

Any thoughts? With this behavior, I’m sort of back to square one, in terms of functionality.

Thanks,

Doug

austinfwd · October 18, 2011, 10:56pm

For kicks, I decided to have webmin dump the table to a CSV again, only save it as a local file this time, rather than displaying on screen. This only took 9 seconds.

I think our next test will be to use PHPMyAdmin. If it is also super fast, then I would have to think that the slowness is probably being caused by the way our PHP app has been coded.

Anyone agree? Disagree?

I would still like to know why the 30 second fcgi timeout has returned.

Thanks,

Doug

Eric · October 18, 2011, 11:24pm

phpMyAdmin is just a PHP app, so if dumping a table is significantly faster there than in your PHP app, it is indeed likely to do with the way the app is coded.

As for your 30-second timeout, it’s difficult to say why that would happen, but you might want to try restarting Apache just to be super-sure that it’s not just in need of a re-read of its config files.

-Eric

austinfwd · October 18, 2011, 11:42pm

I just stopped & then started apache & tried again, and we’re still getting the 30-second timeout.

Meanwhile, Server Configuration > Website Options is still set to fcgi w/ a 301 second timeout.

Any other ideas?

Thanks,

Doug

austinfwd · October 23, 2011, 6:19am

I added some more apache config parameters to my virtual host .conf file. It now has:

IPCCommTimeout 301
IdleTimeout 300
BusyTimeout 300
ProcessLifeTime 7200
IPCConnectTimeout 300
MaxRequestLen 100000000

I picked up these additional parameters as suggested here:
http://www.howtoforge.com/forums/showthread.php?t=44427
http://www.moe.co.uk/2010/04/12/mod_fcgid-ap_pass_brigade-failed-and-premature-end-of-script-headers/

And officially documented here:
http://httpd.apache.org/mod_fcgid/mod/mod_fcgid.html

At this point, I’ve found some more interesting info:

It seems that every time that I restart apache2, the PHP script is able to run with fcgid for the full 4.5 minutes that it needs to fully execute. I get the results in the web browser as desired. However, then if I execute the script again immediately after the first initial success, it fails with:
mod_fcgid: read data timeout in 31 seconds

I’m really not sure why it would succeed initially, but then subsequently fail. This appears to be consistent behavior though. Additionally, if I retry to run it again & again, eventually it succeeds again, but then is immediately followed by another failure.

It’s almost like the fcgi has some sort of memory limit for the process, rather than just the current request. And maybe it is spawning a new process (or I’m randomly getting a new fcgi process) sometimes, which is causing it to occasionally succeed…? I really don’t know here, but I’m just trying to wrap my mind around the problem. If it were a memory limit, then it seems odd that it would always be stopping at 31 seconds (which is consistently logged in the apache error log EVERY time that the script fails [mod_fcgid: read data timeout in 31 seconds]).

Also, if I’m just using the CGI PHP execution mode, instead of fcgi, it succeeds every time. This makes me think it is not a php-specific issue, but rather something more specific to fcgi (which is where the logged error is coming from anyway).

I sure would appreciate some help here.

Thanks,

Doug

Eric · October 24, 2011, 7:16pm

Can you remind me what distro it is you’re using? And what version of mod_fcgid do you have there?

Thanks!

-Eric

austinfwd · October 25, 2011, 7:03pm

I’m running 32-bit Debian 5 (Lenny).

We’re using: libapache2-mod-fcgid version 1:2.2-1+lenny1

Depends on: libc6 (>= 2.7-1), apache2.2-common

Description: an alternative module compat with mod_fastcgi
It is a binary compatibility alternative to Apache module mod_fastcgi.
mod_fcgid has a new process management strategy, which concentrates on reducing
the number of fastcgi server, and kick out the corrupt fastcgi server as soon
as possible.

austinfwd · October 28, 2011, 5:43pm

Hello…? Anybody have any further thoughts to help me out here? Do you think that this doesn’t even have to do w/ Virtualmin per se, and I should be talking to the Debian mod_fcgid package maintainer?

Eric · October 28, 2011, 6:48pm

Sorry, I’m just not sure what to suggest… if it works fine for one request, but not for subsequent requests – you may be running into a bug of some sort.

I doubt you’d get much assistance from the package maintainers on Debian Lenny, since Lenny is going to reach its end of life in a few months.

Is upgrading to Debian 6 an option? You’d need to do that within a few months anyhow… perhaps this is a good reason to do that sooner rather than later

You could use CGI in the meantime, then switch back to FCGID once you move to Debian 6.

-Eric