So, after working on this for an evening, I ended up with four parts:
- Fixed reverse DNS
- Missing Perl module
- xhr-info taking 11+ seconds (blocking TOTP login)
- TOTP login redirecting to :10000 after success
Reverse DNS
Was just that, simple enough.
xhr-info taking 11+ seconds (blocking TOTP login)
This was out of my depth, but I had a helping hand.
Problem: xhr-info taking 11+ seconds (blocking TOTP login)
Root cause chain:
Every xhr-info request (which the browser sends repeatedly during and after login) called theme_list_combined_system_info(). Because post_has('xhr-info') checks %in, which includes GET query parameters, nocache was always true for xhr-info requests, so the theme cache was bypassed entirely. This forced a full system info collection on every single request.
That collection path hit two separate slow operations:
Slow call 1: virtual_server::collect_system_info called get_startstop_links() unconditionally, which ran systemctl show for every service on the system. Each systemctl show call took ~0.9 seconds, and there were multiple invocations, adding up to ~6 seconds total. This happened because authentic-lib.pl called virtual_server::collect_system_info('manual') without checking whether the virtualmin collected cache was still fresh.
Fix: Patched authentic-lib.pl to check the age of the virtualmin collected file before triggering a full virtual_server::collect_system_info. If the file is newer than collect_interval minutes, skip the recollection and use cached data.
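For anyone who wants to see the idea in isolation, here is a minimal sketch of that freshness check. It is not the actual patch: collected_is_fresh is a hypothetical helper name, and the demo uses a temporary file instead of the real collected file so it can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not a Webmin or Virtualmin function): returns 1
# when $file exists and was modified less than $interval_min minutes ago.
sub collected_is_fresh {
    my ($file, $interval_min) = @_;
    my @st = stat($file);
    return 0 unless @st;            # a missing file is never fresh
    my $age = time() - $st[9];      # seconds since last modification
    return $age < $interval_min * 60 ? 1 : 0;
}

# Demo on a temporary file standing in for the collected file.
my ($fh, $path) = tempfile(UNLINK => 1);
close($fh);

print collected_is_fresh($path, 5) ? "fresh: reuse cached data\n"
                                   : "stale: recollect\n";

# Backdate the file by ten minutes so it is older than the interval.
my $old = time() - 600;
utime($old, $old, $path);
print collected_is_fresh($path, 5) ? "fresh: reuse cached data\n"
                                   : "stale: recollect\n";
```

The first check reports fresh and the second reports stale. In the real patch, the expensive collect call would only run in the stale branch.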
Also patched virtual-server/collect-lib.pl to use ||= instead of = when populating startstop, so it won’t overwrite an already-populated value from the cache.
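To illustrate why ||= matters here, a small standalone sketch follows. The hash layout, the array-ref value, and query_services_live are assumptions made up for the demo and are not taken from collect-lib.pl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the expensive live query; counts its calls.
my $live_queries = 0;
sub query_services_live {
    $live_queries++;
    return [ 'fresh-from-systemd' ];
}

# Pretend this hash was loaded from the collected-info cache.
my %info = ( startstop => [ 'from-cache' ] );

# Plain assignment always replaces the cached value.
my %overwritten = %info;
$overwritten{startstop} = query_services_live();

# ||= assigns only when the left side is false (undef, 0, ''), and the
# right side is not even evaluated when the left side is already true.
my %preserved = %info;
$preserved{startstop} ||= query_services_live();

print "plain =  : $overwritten{startstop}[0]\n";
print "with ||= : $preserved{startstop}[0]\n";
print "live queries run: $live_queries\n";
```

Only the plain assignment triggers the live query, so the counter ends at 1 and the ||= copy keeps the cached entry. One caveat: ||= tests truthiness, so a cached 0 or empty string would still be replaced, while an empty array reference counts as true and would be kept.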
Also disabled refresh_startstop_status() in xhr-lib.pl — this was a redundant live systemd query called before theme_list_combined_system_info on every xhr-info request.
Slow call 2: get_cpu_io_usage calls select(undef, undef, undef, 0.5) for a 0.5-second sleep between two CPU sampling reads. In miniserv’s forked child process, Perl’s select() with no file descriptor sets interacts with miniserv’s master event loop — the master polls via pselect6([5 6 10], timeout=2s) and won’t notice the child has finished until the next 2-second poll cycle. With multiple concurrent requests each triggering this sleep, the master accumulated multiple 2-second poll misses, resulting in ~10 seconds of effective serialization delay even though the actual sleep was only 0.5 seconds.
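For reference, the sample, sleep, sample pattern described above looks roughly like this when isolated. read_counter is a hypothetical stand-in for the real counter read, and this sketch does not reproduce the miniserv interaction, only the 0.5-second pause itself.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in for one CPU counter read; the real code reads
# system counters, which this sketch does not attempt to reproduce.
my $fake_counter = 0;
sub read_counter { return $fake_counter += 100; }

my $t0    = time();
my $first = read_counter();

# Four-argument select with no file handles is Perl's sub-second sleep.
select(undef, undef, undef, 0.5);

my $second  = read_counter();
my $elapsed = time() - $t0;

printf "delta=%d over %.2f s\n", $second - $first, $elapsed;
```

Run on its own, this should take about half a second, which fits the observation that the extra delay came from how the request was handled and not from the sleep itself.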
Workaround: Disabled get_cpu_io_usage collection in system-status-lib.pl (if (0 && defined(...))). CPU I/O stats are cosmetic, and having 2FA working seems more important. Also disabled collect_pkgs to prevent apt update checks.
Supporting fixes:
- Disabled the boot-time webmincron cron (scheduled_collect_system_info), which was acquiring the in-memory lock at startup and never releasing it, causing all subsequent collection attempts to hit “Already running”
- Disabled the 5-minute interval webmincron cron for the same reason
- Set collect_interval=5 and collect_pkgs=0 in /etc/webmin/system-status/config
- Installed Net::WebSocket::Server (via CPAN, plus libprotocol-websocket-perl via apt) so stats.pl can start when needed
TOTP login redirecting to :10000 after success
This could only be captured after a successful login: Webmin sent back a redirect to the FQDN on port 10000, which was not intercepted by the nginx config for Virtualmin.
Adding this to the config resolved it:
proxy_redirect https://[fqdn]:10000/ https://[fqdn]/;
Logins are now sub-second, or close enough.
Files modified
| File | Change |
| --- | --- |
| /usr/share/webmin/authentic-theme/authentic-lib.pl | Skip virtualmin collect_system_info if collected file is fresh |
| /usr/share/webmin/authentic-theme/xhr-lib.pl | Disabled refresh_startstop_status() call |
| /usr/share/webmin/virtual-server/collect-lib.pl | Use \|\|= for startstop to preserve cached value |
| /usr/share/webmin/system-status/system-status-lib.pl | Disabled get_cpu_io_usage block |
| /etc/webmin/system-status/config | collect_interval=5, collect_pkgs=0 |
| /etc/webmin/webmincron/crons/1776370355146080.cron | Renamed to .off (interval cron disabled) |
| /etc/webmin/webmincron/crons/1776370355146081.cron | Removed boot=1 (boot cron disabled) |
| /etc/nginx/sites-available/virtualmin-control | Added proxy_redirect for [fqdn]:10000 |
I would very much like a review of these changes from someone more knowledgeable about Virtualmin than me. I haven’t even been working with this system for a month yet.
Also, would this be considered the solution? I’ll wait a day or two for responses before marking this as the solution if nothing else pops up here.