If NGINX Reverse Proxy Isn't Supported... Best Approach for FastCGI?

I'm moving from cPanel to Virtualmin. Right now, I have NGINX serving static content with a reverse proxy setup that lets Apache handle dynamic content, including spawning FastCGI processes for Perl. I configured the cPanel server to spawn a FastCGI process for scripts with the .fpl extension, while letting NGINX pass legacy non-FastCGI scripts that clients might still use to the normal Apache mod_cgi.
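
For reference, the NGINX side of that setup looks roughly like this (a simplified sketch; the domain, backend port, and paths are illustrative, not my actual config):

server {
    listen 80;
    server_name example.com;
    root /home/example/public_html;

    # Static assets served straight from disk by NGINX
    location ~* \.(?:css|js|png|jpe?g|gif|ico|svg)$ {
        expires 7d;
    }

    # Everything else (.fpl FastCGI and legacy CGI included) goes to Apache
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}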

If I completely remove Apache, as Virtualmin suggests for using NGINX, does Virtualmin provide any kind of preconfigured FCGI handling? Most of the code in question skips frameworks like Plack and works directly with FastCGI. Also, what happens to legacy CGI scripts that haven't been modified to use FastCGI? Are they supported to any degree?

Thanks!

SYSTEM INFORMATION
OS type and version: Debian 12
Webmin version: 2.111
Virtualmin version: 7.10

This comes up from time to time. But honestly, if you are moving from cPanel, the chances of seeing much real-world impact from the nginx-in-front-of-Apache setup are probably minimal. Last time it came up, I did some searching and could find no actual benchmarks for ANY system using this. Maybe at the enterprise level this starts to be a real concern? I simply couldn't find supporting documentation.

There is one panel, I believe quasi-open-source, that uses this setup. If you need it, then that might be a good thing to search out.


Thanks for your help. Benchmarking cached, static content, NGINX cut response times by a measurable amount over Apache (I got rather obsessed with getting our clients' sites served as quickly as possible to make Google's metrics happy). I've found some references to Virtualmin being able to spawn FastCGI processes within NGINX directly, which would probably suit our purposes quite well, since the really important code is already oriented toward FastCGI. Maybe I just need to give it a spin and see what happens…

Well, the GPL version has a quite low cost of ownership. :wink:

I'm not sure Hestia was the panel I was thinking of, because I remember one that uses Nginx/Apache as the default. But:
  • Nginx FastCGI cache support for Nginx + PHP-FPM
  • Nginx Proxy cache support for Nginx + Apache2

But others with more direct knowledge will probably stumble by. @Stegan is an nginx kinda admin, so he might have something to say here.

Thank you! I'll take a gander at Hestia. Hopefully I can get things working for my needs here, though – like you said, the cost is great, but even with the subscription, I like supporting Open Source development via Virtualmin as opposed to just dumping money into cPanel's coffers. I've been wanting to try moving to Virtualmin for a long time and I really like the general design of it…

Hello,

Yes, we set up a simple FCGIWrap server with Nginx to support it, allowing users to run scripts in languages like Perl or Python. It should just work out of the box.
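
The generated config is roughly of this shape (a sketch rather than Virtualmin's literal output; the socket path is Debian's fcgiwrap default and the docroot is illustrative):

location /cgi-bin/ {
    gzip off;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME /home/example/cgi-bin$fastcgi_script_name;
    fastcgi_pass unix:/run/fcgiwrap.socket;
}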


It might be interesting to get some results back, @tbutler. I assume that since you care enough to do testing, you'll do some again?


There's nothing useful to test, IMHO. Web servers are plenty fast. It's always the apps that are slow. Apache needs a little more memory, but performance is fine…it can serve thousands of requests per second on the tiniest machine you could rent for hosting today. Nobody running Virtualmin is serving thousands of requests per second (that's tens or hundreds of millions of requests a day) to hundreds of connected clients with excellent latency.

If you're worrying about Apache performance beyond the basics (don't use/load mod_php, and use an efficient MPM rather than prefork, which is another reason not to load mod_php, since it forces prefork), you're spending time on the wrong thing. If you saw a "slow" Apache, it was an Apache with mod_php loaded or explicitly configured to use the prefork MPM. But even that isn't slow enough to matter compared to what any application written in a dynamic language like PHP, JavaScript, Perl, Python, Ruby, etc. can do. Your apps are literally orders of magnitude slower than Apache. Unless you're only serving static sites and you have crazy traffic with millions of daily visitors, you almost certainly should spend your time optimizing literally everything else before you even start thinking about the web server.
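
On Debian, for example, that basic tune-up is just a few commands (a sketch; php8.2 is the stock PHP on Debian 12, and this assumes the php8.2-fpm package is installed):

a2dismod php8.2 mpm_prefork     # drop mod_php, which forces prefork
a2enmod mpm_event proxy_fcgi setenvif
a2enconf php8.2-fpm             # hand PHP to php-fpm over FastCGI instead
systemctl restart apache2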

And, if you want to run nginx, just run nginx. It’s silly to run two web servers, and more than negates the one clear benefit of running nginx. The smaller memory footprint of nginx is the one thing about nginx I think is worth seriously considering, if you’re running on modest hardware…performance is irrelevant at the scale we’re all talking about. Running Apache+nginx removes that benefit and means you need more memory than Apache alone. I’m sorry, that’s just an obviously bad idea. Running a proxy also adds another hop of latency! You make your requests to Apache slower by running nginx in front; it’s by a small amount, but if you’re willing to jump through crazy hoops to get a tiny performance boost for static files, it’s silly to hurt performance of your dynamic pages by a similar margin. Not to mention, static assets can be cached at the edge (in the browser, in Cloudflare/Fastly/etc.), dynamic pages cannot. You’re optimizing the exact wrong things if you’re optimizing static assets at the expense of dynamic ones.

In short: the reason to pick nginx, when you might otherwise prefer the flexibility of Apache, is that you're running on a resource-constrained system (i.e. low memory). Running two web servers on a resource-constrained system is a bad idea. I can't imagine how that is controversial or why people seriously discuss it as some kind of rational option; I honestly don't know. I won't be part of a silly waste of resources like that, not to mention the wasted human time of maintaining multiple web servers for no danged reason.

Also, unrelated to nginx or performance, it looks like some of y’all have the impression fcgiwrap is for running fcgi applications. That’s not so. nginx supports fcgi natively, as does Apache. fcgiwrap is for running CGI applications (CGI is a completely different protocol from FCGI), and we now use it on both nginx and Apache because even though Apache has native support for CGI applications, we no longer build a custom Apache that has suexec_docroot set to /home on EL systems. fcgiwrap allows us to run CGI scripts as the domain owner user for both Apache and nginx.
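
To make the distinction concrete in nginx terms (the socket paths here are just illustrative):

# Native FastCGI: nginx speaks FCGI directly to a persistent app process
location /app {
    include fastcgi_params;
    fastcgi_pass unix:/run/myapp.sock;
}

# CGI via fcgiwrap: nginx still speaks FCGI, but fcgiwrap forks a fresh
# one-shot CGI process for every single request
location /cgi-bin/ {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/fcgiwrap.socket;
}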


In case my long rant about how dumb it is to run both Apache and nginx on the same system isn’t concise/clear enough, here’s a summary:

nginx and Apache support fastcgi natively.

Virtualmin uses fcgiwrap to allow both Apache and nginx to run CGI applications.

If you want to use both nginx and Apache on the same system, I don’t want you to use Virtualmin, because I don’t want that support burden, because you’re going out of your way to make everything worse in all dimensions (slower on the requests that you should be worried about being slow, more resource-intensive, and harder to configure and maintain) for no good reason.

We don’t support nginx as a proxy for Apache not because it’s hard for us to do, but because it’s a bad idea and we want to make a good product that makes providing reliable web service easy. It is not a feature, it’s a bug.


Thanks for all of the input, @Joe. As for why I went the route I did several years ago: the full reasoning is a tad foggy at this point, but I seem to recall two points. First, in most cases I could shave significant time off requests by running them through NGINX (maybe dropping a request from, say, 120ms to 60ms). Second, I tried to go exclusively with NGINX, but the FastCGI documentation is almost exclusively about PHP rather than Perl. (The fact that cPanel actively encourages the reverse proxy configuration was probably a third reason I went that route.)

I spent months ironing out the Perl side of things, too: profiling every module, caching everything I could, looking for slow regexes, doing manual cleanup where Perl wasn't as memory-efficient as I wanted… Web server config sometimes twists my head in knots, but pure Perl optimization is a more amenable challenge for me. Probably too obsessed, but in the end pages loaded 10-20x faster, which is about as close to serving static pages as I think is doable.

Where I got stuck with a pure NGINX setup is something Apache does very well: when a FastCGI-enabled Perl script is launched, it automatically spawns a process for it and leaves it running (for a certain number of requests) using the fcgid-script handler. If I've told it to handle "fpl" files as FastCGI, and weatherDesk.fpl is requested and doesn't presently have a process running, Apache spawns one (or, if several requests come in, it'll spawn multiple processes, up to a certain limit, to handle them concurrently).
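
The Apache side of that looks something like this (a sketch; the values are just illustrative, not recommendations):

AddHandler fcgid-script .fpl
# Recycle a process after this many requests
FcgidMaxRequestsPerProcess 1000
# Cap on concurrent processes per script
FcgidMaxProcessesPerClass 8
# Reap processes idle longer than five minutes
FcgidIdleTimeout 300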

Before I reinstall with the LEMP setup — presumably I’ll just wipe and reinstall — does your configuration of NGINX handle spawning FastCGI processes like that? That was the part that went over my head…

@ID10T, I'm happy to report back the fruits of whatever optimization I end up taking. I'll heed @Joe's warnings about the NGINX proxy config and try to see if I can focus on optimizing a different path. The individual core speed of my new server is only slightly faster than the old one (just more cores), so it should make for something of an interesting comparison. Stay tuned and I'll share what I find on that front, once I figure out the question I mentioned above to Joe…


So I remain stuck. The included FCGIWrap doesn't allow FastCGI-enabled Perl scripts to persist, so each request for a script requires full initialization: loading modules, connecting to databases, etc. Do you have any suggestions on how to enable something like that, à la mod_fcgid on Apache? (This is how I ended up using NGINX as a reverse proxy years ago.)
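
To illustrate the pattern I'm after: a persistent FastCGI Perl script does its expensive startup once, then loops on requests. A minimal sketch using the CPAN FCGI module (the SQLite connection is a hypothetical stand-in for real startup work):

#!/usr/bin/perl
use strict;
use warnings;
use FCGI;
use DBI;

# Expensive one-time setup: loading modules, connecting to databases, etc.
my $dbh = DBI->connect('dbi:SQLite:dbname=/tmp/example.db', '', '')
    or die $DBI::errstr;

# Accept() blocks until the web server hands this process a request,
# so everything set up above survives between requests
my $request = FCGI::Request();
while ($request->Accept() >= 0) {
    print "Content-Type: text/plain\r\n\r\n";
    print "Served by persistent PID $$\n";
}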

I'm trying to develop something to accomplish this, but haven't managed to write a working autospawning wrapper NGINX can connect to yet. Here's my still-not-functioning code, if you're curious. Caution: the code is messy – I'm just trying to get something working before I clean it up: GitHub - trbutler/perlFCGI: An attempt to mimic mod_fcgid's autospawning functionality on NGINX. It does not work yet.

You have many options. You can proxy to any app server over HTTP (something like Starman), or you could use any of several app servers for Perl that provide an FCGI interface. Plack is a sort of Swiss Army knife for interfacing a wide variety of Perl webby things with a wide variety of other webby things: Plack-1.0051 - Perl Superglue for Web frameworks and Web Servers (PSGI toolkit) - metacpan.org (er, the docs are maybe easier to make sense of: Plack - Perl Superglue for Web frameworks and Web Servers (PSGI toolkit) - metacpan.org)
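
To illustrate, once an app speaks PSGI, either deployment style is a one-liner (app.psgi, the socket path, and the worker counts are placeholders):

# HTTP app server; nginx (or Apache) proxies to it
plackup -s Starman --listen :5000 --workers 4 app.psgi

# Or expose the same app to the web server over FastCGI
plackup -s FCGI --listen /run/myapp.sock --nproc 4 app.psgi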

What app framework are you using? There’s almost certainly already some docs about various deployment tactics.


My code is a bespoke quasi-framework that's been in development for two decades. Maybe not what I'd do if I were starting over, but it runs well. At some point I moved it to FastCGI; I've considered moving it to Plack on Starman at times. Two things have kept me from that so far:

  • It runs really fast as it is and some of the legacy code had issues when I experimented with moving from FCGI to PSGI.
  • The biggie: the code goes onto each virtual server account. I haven't been able to figure out how to efficiently use something like Starman and give each account its own access to the code without manually configuring an instance of Starman for each user. On the other hand, mod_fcgid can spin up my FCGI code as needed for each user.

That last one is my big point of confusion. Is there an efficient way, in a virtual environment like Virtualmin, to provide Starman to each user? It seems like I'm defeating what I'm trying to do with hosting automation if I'm manually configuring a bunch of Starman servers alongside folks' normal accounts.
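
One idea I've been toying with is a systemd template unit, so each account gets its own Starman instance without hand-editing a config per user. A rough, untested sketch (the paths and per-account app layout are assumptions on my part):

# /etc/systemd/system/starman@.service
[Unit]
Description=Starman PSGI server for %i
After=network.target

[Service]
User=%i
Group=%i
WorkingDirectory=/home/%i/app
# Adjust the starman path to wherever it's installed
ExecStart=/usr/local/bin/starman --listen /home/%i/app/starman.sock --workers 2 /home/%i/app/app.psgi
Restart=on-failure

[Install]
WantedBy=multi-user.target

Each account would then be one command (systemctl enable --now starman@username), plus pointing that virtual server's web server config at the matching socket. But that still isn't hooked into Virtualmin's domain creation, which is the part I can't figure out.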

Thanks so much!

@ID10T, as promised, the start of some benchmarks. So far, here's what I've found: my current NGINX reverse proxy setup, on slower hardware (a single Xeon E-2236 vs. dual Xeon 4309Y), dramatically outperforms the default Virtualmin Apache setup by about 7x on static content, and the default Virtualmin NGINX setup by 10x. Clearly I have some work cut out for me to figure out what I managed to optimize before that I haven't yet optimized on this server! There's a promising twist below; tl;dr: it involves trying to implement the reverse proxy on Virtualmin.

Here are the old config's results from ab (AlmaLinux 8/WHM/NGINX reverse proxy on Apache):

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    2  19.3      0     342
Processing:    53   88  29.8     76     249
Waiting:       53   88  29.8     76     249
Total:         53   90  37.6     76     431

Percentage of the requests served within a certain time (ms)
  50%     76
  66%     87
  75%     96
  80%    106
  90%    140
  95%    159
  98%    179
  99%    261
 100%    431 (longest request)

My vanilla Apache setup on Virtualmin gives these results (Debian 12/Virtualmin/Apache):

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       81  174  56.5    166     452
Processing:   191  608 322.0    524    2448
Waiting:       82  181  70.9    172     777
Total:        277  781 325.8    697    2641

Percentage of the requests served within a certain time (ms)
  50%    697
  66%    771
  75%    831
  80%    875
  90%   1113
  95%   1582
  98%   1734
  99%   1972
 100%   2641 (longest request)

Here’s the Virtualmin NGINX stack:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    4  36.5      0     386
Processing:   130  899 390.8    830    3739
Waiting:       80  369 180.6    344    1260
Total:        130  903 390.8    835    3739

Percentage of the requests served within a certain time (ms)
  50%    835
  66%    987
  75%   1094
  80%   1183
  90%   1437
  95%   1641
  98%   1934
  99%   2121
 100%   3739 (longest request)

If I install NGINX and configure a reverse proxy on the Virtualmin server (sorry, @Joe, I'm not trying to be ornery, just experimenting to figure out why the server was performing so much worse than my old one; I'm open to suggestions on how to achieve the same results in a way you/Virtualmin recommend), the improvement is so dramatic that it takes the Virtualmin server from being 7-10x slower than the old cPanel server to somewhat faster than it. This is on the same server that averaged a mean total time of 781 ms above:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3  29.6      0     341
Processing:    69   71   1.8     71     181
Waiting:       67   71   1.8     71     181
Total:         69   74  29.7     71     411

Percentage of the requests served within a certain time (ms)
  50%     71
  66%     71
  75%     71
  80%     71
  90%     72
  95%     73
  98%     75
  99%    300
 100%    411 (longest request)

The number of requests handled per second went up from somewhere between 14 and 200/s to 1300+/s with the change. Running ab as configured above went from taking about 11 minutes to just seconds.

The only problem is that since Virtualmin doesn't support this configuration out of the box, I need to:

  1. Figure out how to achieve similar performance in a supported configuration, since Joe strongly recommended against an NGINX reverse proxy.
  2. Figure out how to script scraping the hosts from the Apache config and updating the NGINX configuration automatically, so reverse proxying is cleanly set up on virtual hosts.
  3. Go back to cPanel, but I sure would rather achieve 1 or 2.

Thoughts?


If you show a valid use case, I doubt the staff will ignore it. Most people here make decisions based on what they've read without fully understanding the implications. Mostly they believe in theoretical gains that aren't applicable to them.

That is encouraging! I hope so. I'm trying to see if I can build a little script that could scrape the Apache config and create NGINX configuration files in the interim. If I can get it to trigger any time a domain is added or removed, it might work as an OK stopgap.
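
Here's the rough shape of what I have in mind, completely untested; it assumes Debian's /etc/apache2 layout and Apache moved to port 8080 so NGINX can take over port 80:

#!/usr/bin/perl
use strict;
use warnings;

# Untested sketch: read ServerName/DocumentRoot out of each enabled Apache
# vhost and emit a matching NGINX reverse-proxy server block.
my $apache_dir = '/etc/apache2/sites-enabled';
my $nginx_dir  = '/etc/nginx/conf.d';

for my $file (glob "$apache_dir/*.conf") {
    open my $fh, '<', $file or next;
    my ($name, $root);
    while (<$fh>) {
        $name //= $1 if /^\s*ServerName\s+(\S+)/i;
        $root //= $1 if /^\s*DocumentRoot\s+"?([^"\s]+)/i;
    }
    close $fh;
    next unless $name && $root;

    open my $out, '>', "$nginx_dir/$name.conf" or die "$name: $!";
    print $out <<"CONF";
server {
    listen 80;
    server_name $name;
    root $root;

    # Static files straight from disk; everything else falls back to Apache
    location / {
        try_files \$uri \@apache;
    }
    location \@apache {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
    }
}
CONF
    close $out;
}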

A key place this sort of improvement really helps is with XHR requests and other resource loading – while it might not make a huge difference on a single static page, if a page requires 5-10 components (or more) to load, this sort of gain is quite noticeable to the end user.

I’m a tad puzzled why NGINX as a reverse proxy achieves a result so much better than NGINX serving directly, so maybe there’s something I can do to still avoid the reverse proxy arrangement. I would have guessed pure NGINX would be the fastest option, but had opted years ago for the reverse proxy to avoid losing some of the creature comforts of Apache I’ve mentioned above.

I ended up back here when I was researching how to optimize Apache, and a lot of recommendations revolve around using Varnish; but Varnish appears to be a reverse proxy without SSL support, so howtos recommend pairing it with NGINX in front of Apache to handle SSL. If I were going to do that, I thought I might as well see whether the NGINX reverse proxy configuration was, somehow, the distinguishing factor between the older, slower server (that was running faster) and the newer, faster server (that was running slower). I'm not really sure why one would attempt two different reverse proxies, after all.

So far, while I can't explain it, I'm very impressed with what a difference the reverse proxy is making, but I'm definitely open to some other approach. I'll keep digging!

I saw a post, I think here, where someone said they were trying to find some time to rewrite the Apache documentation on caching. My background is more networking (CCNP at one point); server work came as a necessity, not necessarily a love. But it makes no sense that putting nginx in front of Apache should speed up caching unless Apache caching really is bad and/or really just misunderstood.


Yes, that's the part that has me mystified. I suspect maybe Apache's caching really /isn't/ that great, because there seem to be a lot of recommendations to use either NGINX or Varnish as a reverse proxy for speeding things up. But I'd love to find a simpler way to achieve anywhere near the same performance.

Actually, even more confusing to me is this: serving up the static default home page from Virtualmin, NGINX as a reverse proxy outperforms NGINX as a pure web server, as I shared in those numbers. I could see where NGINX has superior caching to Apache, but why wouldn't that apply to NGINX serving its own pages?

My memory is foggy, but was NGINX perhaps more of a reverse proxy than a full-fledged web server at first? Maybe the reverse proxy/caching side is actually better optimized than the web server portion. I'm not sure; it doesn't make logical sense.

I think this used to be the case (in the days of mod_php), and, as pointed out, you can use NGINX (and possibly Varnish) as a load balancer to get more performance, but that is obviously not exactly the same comparison as NGINX vs. Apache.

NGINX is a full web server. It is one of a few web servers offering HTTP/3 (still in testing, but available).
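
For anyone curious, enabling it looks roughly like this on nginx 1.25+ built with QUIC support (the certificate paths are placeholders):

server {
    listen 443 quic reuseport;   # HTTP/3 over QUIC
    listen 443 ssl;              # HTTP/1.1 and HTTP/2 fallback
    http2 on;
    server_name example.com;

    ssl_certificate     /etc/ssl/certs/example.crt;
    ssl_certificate_key /etc/ssl/private/example.key;

    # Tell clients connecting over TCP that HTTP/3 is available
    add_header Alt-Svc 'h3=":443"; ma=86400';
}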