I have 2 Virtualmin servers on VPS @ ServerCheap & Namecheap.
Both are running Debian 12, and both are set to auto-update everything.
I can’t tell you exactly when (I’ve been in the hospital the last few days) but something happened where I have the exact same problem with both of them. I can’t log in.
Try to log into Virtualmin and the loading dial just spins and spins.
Try to log in via SSH and no connection can ever be established.
What’s the statistical probability of a hardware error happening within a couple of days on two different servers managed by different operators?
Ports? VM & SSH run on different ports?
I’m kinda completely flummoxed.
I’m not at all worried about rebuilding the servers (Virtualmin rocks!) as everything is double backed up (I see the backups in my S3 buckets). I’m fairly confident this has absolutely nothing to do with Virtualmin except… one of the servers is running a mailbox and it cannot be reached.
Anyone have anything on this? Any Debian updates in the last several days which may have caused this? All websites running on the servers are fully functional. All Docker containers are running as well.
It can’t be hardware if there are services that are running fine.
There are a few ways to get blocked, temporarily, by a firewall that could lead to connections failing. Fail2ban, as well as the ssh and Webmin brute force protection features (the latter wouldn’t be a connection failure, though…it’d be a denied message). Try coming in from a different source IP (e.g. tether through your phone or go visit a coffee shop with WiFi).
I assume you’ve tried rebooting one of them to see if things come back to normal? It’s possible the OOM killer has killed ssh and Webmin. Since they weren’t active and your websites are, they’d be reasonably likely to be the thing that gets killed (though I think ssh is probably protected…systemd allows prioritizing/protecting services from being killed by the OOM killer, but I don’t actually understand it very well).
Most VM providers offer a way to log into a virtual console. That’s what I’d try next. Then you can poke around in the logs and dmesg to see what’s going on, check the status of ssh and webmin services and their logs to see if they’re running, and check what IPs fail2ban has blocked.
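For what it’s worth, here is a rough sketch of those console checks as a small Python script. It assumes Debian’s unit names (`ssh` and `webmin`), fail2ban’s default `sshd` jail, and that you run it as root on the server; adjust to whatever your setup actually uses. The `run` and `oom_lines` helpers are just illustrative names, not part of any tool.

```python
import subprocess


def run(cmd, timeout=15):
    """Run a command; return (exit_code, combined output).

    A missing tool or a hang is reported as text instead of raising.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return 127, f"{cmd[0]}: not installed"
    except subprocess.TimeoutExpired:
        return 124, f"{cmd[0]}: timed out"
    return done.returncode, (done.stdout + done.stderr).strip()


def oom_lines(kernel_log):
    """Return kernel log lines that look like OOM-killer activity."""
    needles = ("out of memory", "oom")
    return [line for line in kernel_log.splitlines()
            if any(n in line.lower() for n in needles)]


if __name__ == "__main__":
    # Assumed unit names: "ssh" and "webmin" (the Debian defaults).
    print(run(["systemctl", "is-active", "ssh", "webmin"])[1])

    # Assumed jail name: "sshd". The output lists currently banned IPs.
    print(run(["fail2ban-client", "status", "sshd"])[1])

    # Kernel messages from the journal, filtered for OOM kills.
    _, kernel_log = run(["journalctl", "-k", "--no-pager"])
    for line in oom_lines(kernel_log):
        print(line)
```

If both services report `active`, nothing was OOM-killed, and your own address is not in the banned list, the server side is probably fine and the problem is somewhere between you and it.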
Naturally, before writing here I submitted tickets to both providers. Namecheap got back to me first, telling me it was a firewall issue. (Still waiting to hear back from ServerCheap.)
But I have been able to confirm it is in fact a firewall issue. I got off the hospital’s wifi and switched to the hotspot on my phone. Everything is as it was. (And Joe was correct.)
Now I just have to figure out how the hospital’s address got blocked.
I work on hospital networks sometimes at my day job, and in my experience they block everything by default; it takes a lot of negotiating to get access to anything beyond web ports. So, I wouldn’t expect it to be the hospital being blocked at the provider, but rather the hospital network only allowing web traffic. To put it another way: You should not expect to use ssh or port 10000 from a hospital WiFi network.
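If you want to see this for yourself, a quick probe from whatever network you’re on shows which ports get out. This is only a sketch: `server.example.com` is a placeholder, and 22 and 10000 are the stock SSH and Webmin ports, so substitute your own if you moved them.

```python
import socket


def probe(host, port, timeout=5.0):
    """Try one TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered and actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer at all, which is typical when a firewall drops packets.
        return "timeout"
    except OSError as exc:
        # DNS failure, no route, and similar problems.
        return f"error: {exc}"


if __name__ == "__main__":
    host = "server.example.com"  # placeholder; use your server's name or IP
    for port in (443, 22, 10000):  # HTTPS, default SSH, default Webmin
        print(port, probe(host, port))
```

If 443 comes back open while 22 and 10000 time out, and the same script shows all three open from a phone hotspot, the filtering is most likely on the local network rather than on the server.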