This is a new thread following on from my previous thread (multiple WP instances), as the cause of the problem seems quite different to what I originally thought. I have spent quite a lot of time troubleshooting. The server is currently down and there are some expectations around getting it back up. I am holding off rebooting, just to see if I can find a root cause.
If I can't find a root cause in time I will need to reboot, and consider rebuilding the server, which would be quite the hassle but may be necessary.
Here is what I have found so far
The problem is intermittent: the server can be running fine and, for no reason I can see, it just loses all connectivity
I cannot ping the server and I cannot connect via SSH. Attempting to connect via SSH just hangs
I can connect to the server via EC2 serial connect, the server seems to be running fine, no excessive load
On the server I cannot ping 8.8.8.8
The server is assigned an internal IP address and this also cannot be pinged
On a working EC2 instance the above can be pinged so I am thinking the above is good evidence of a fundamental internal networking problem
ip a returns the below; there does not seem to be any IPv4 address?
I don't think I see any IPv6 address, either. Anything in fe80 is a link-local address. AFAIK there will always be an address like that on an IPv6-capable interface, even if bringing up the network fails (via DHCP or whatever).
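If it helps to check this mechanically, here is a rough Python sketch that separates link-local addresses from usable ones. It assumes the JSON output of `ip -j addr show` (iproute2's JSON mode); the function name `usable_addresses` and the sample data are made up for illustration and are not taken from the server in this thread.

```python
import ipaddress
import json


def usable_addresses(ip_json: str) -> dict[str, list[str]]:
    """Map each non-loopback interface to its non-link-local addresses."""
    result = {}
    for iface in json.loads(ip_json):
        name = iface.get("ifname", "?")
        if name == "lo":
            continue  # loopback never carries external traffic
        addrs = []
        for info in iface.get("addr_info", []):
            addr = ipaddress.ip_address(info["local"])
            # fe80::/10 (IPv6) and 169.254.0.0/16 (IPv4) are link-local
            if not addr.is_link_local and not addr.is_loopback:
                addrs.append(str(addr))
        result[name] = addrs
    return result


# Hypothetical sample: an interface that only has an fe80:: address.
SAMPLE = json.dumps([
    {"ifname": "lo",
     "addr_info": [{"family": "inet", "local": "127.0.0.1", "prefixlen": 8}]},
    {"ifname": "ens5",
     "addr_info": [{"family": "inet6", "local": "fe80::1ff:fe23:4567:890a",
                    "prefixlen": 64}]},
])

if __name__ == "__main__":
    for name, addrs in usable_addresses(SAMPLE).items():
        print(name, addrs or "no usable address (link-local only or none)")
```

On a live box you would pass in the real output of `ip -j addr show` instead of `SAMPLE`. An interface that comes back with an empty list has nothing the host's network can route to.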
But, yeah, your problem is that the network is gone. Which is a weird problem. You generally shouldn't touch the network configuration on a cloud virtual machine, because there is nothing useful you can do: you can't change how routing works, and your host is only going to route traffic from and to the address they gave you. Literally any change you make can only break something.
If you tried to use Virtualmin's IP assignment features or Webmin's network configuration tools or any other network configuration tool, that was probably a mistake, because Amazon won't let you do anything that isn't already configured on the VM.
So, do you know how networking is managed on this system and if you've altered configuration? (There are a half dozen network configuration systems in common use on Linux. Ubuntu has netplan, NetworkManager, systemd-networkd, the old Debian-style interfaces shell scripts, and cloud-init. Netplan and cloud-init can drive at least some of those other mechanisms. Fiddling with the wrong one could also break networking.)
Changing the hostname in Webmin also might count as "using Webmin's network configuration tools", if Webmin isn't correctly configured for the network system being used and especially if you ever pressed "Apply Configuration". There's a reason there's a warning next to that button that says, "Warning - this may make your system inaccessible via the network, and cut off access to Webmin", though I don't think that's emphatic or clear enough. It dates back to a time before there were so many competing ways to configure networking, and Webmin just knew which system was in use. Now, it's difficult to figure out how things are configured, since the mere existence of, e.g., /etc/network/interfaces may not mean the old Debian-style scripts are actually being used or are the source of truth. And, even back then it was still a risky maneuver to press that button.
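As a starting point for working out which of those systems has configuration on the box, here is a small Python sketch that only looks for files in the conventional locations. The paths are the usual defaults as far as I know, and the function name `find_network_configs` is just illustrative. As noted above, a file being present is a hint, not proof that the system is the one actually in charge.

```python
from pathlib import Path

# Conventional config locations (assumed defaults, relative to the filesystem
# root). Presence of files here does NOT prove the system is actually in use.
CANDIDATES = {
    "netplan": "etc/netplan",
    "NetworkManager": "etc/NetworkManager/system-connections",
    "systemd-networkd": "etc/systemd/network",
    "ifupdown (Debian-style)": "etc/network/interfaces",
    "cloud-init": "etc/cloud/cloud.cfg.d",
}


def find_network_configs(root: str = "/") -> dict[str, list[str]]:
    """Return the config files found for each known network config system."""
    found = {}
    for name, rel in CANDIDATES.items():
        path = Path(root) / rel
        try:
            if path.is_dir():
                files = sorted(p.name for p in path.iterdir() if p.is_file())
                if files:
                    found[name] = files
            elif path.is_file():
                found[name] = [path.name]
        except PermissionError:
            # Some directories may need root to read; report that plainly.
            found[name] = ["<permission denied>"]
    return found


if __name__ == "__main__":
    for system, files in find_network_configs().items():
        print(f"{system}: {', '.join(files)}")
```

Comparing the modification times of whatever this turns up against when the outages started would be a reasonable next step, since it only reads and never changes anything.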
Thanks. So there's the root cause I think … and thinking back to previous times it happened, I actually think the server was running fine as it is now; I was just making the (incorrect) assumption it "crashed".
Re config: I did the Webmin/Virtualmin install on a fresh EC2 Ubuntu 24 instance. It worked fine, other than right at the end it gives an error to do with the hostname and asks for an FQDN, which I give it, and then it works fine - happy days. I have done 6 or 7 installs this way and it is always the same.
Once installed I don't touch network settings either in AWS or on the box or in Virtualmin. I just don't know enough to even go there.
Re Apply Configuration: I don't know where to find that button, so I don't think I will have pressed it? If you can let me know where to find it I'll check to see if it is familiar…
So I am thinking I will need to rebuild the server, and I may simplify it down to just Virtualmin and enabling WordPress. (It has some Node.js processes and Python processes on it, neither of which do anything at the network level, but just to eliminate variables.) I'll post the domain error message I get when I install Virtualmin, although like I say I think it's more informational from what I can tell.