My VPS host had an outage recently and my server was messed up when it came back online.
One of the issues I have been unable to resolve is that certain virtual servers, when selected in the Virtualmin UI, show the details (home directory, mailbox user, etc.) of the first sub-server instead of the parent.
There is still a functioning mail account and website on the parent server, but I am unable to get to any of the settings. I assume something has got corrupted or the permissions are wrong somewhere since the outage.
It is also not possible to back up the affected top-level virtual servers. They show in the multi-select, but when clicking backup, the error is “Backup failed : No domains selected to backup”.
Nothing quite like having good backups to hand, supported by a well-documented disaster recovery plan.
You clearly can't trace everything that might have happened. No one ever can (the electrons behave in a way that cannot be explained).
Thank you for your replies. That’s a good point, I will check if it lets me back up via the CLI in the morning.
I take it that not being able to load the virtual server settings for a top-level server isn’t a common permissions issue or something like that, then?
I do have backups via rsnapshot and will be migrating very soon over to a new Alma machine, but would love to know what’s going on and how to fix it on this one.
I don’t think it is a common thing. Once you get it sorted, you could do a file and folder comparison to see where the damage happened. It might be interesting and useful.
I’m going to guess some files in /etc/webmin/virtual-server got deleted or corrupted in the “outage”. So, if you have backups, you could compare and restore what’s missing.
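If it helps, here is a rough sketch of that comparison in Python. It hashes every file under two directories and lists what is missing, extra, or changed. The function names and the backup path in the usage line are my own placeholders, not anything Virtualmin provides, and you would need to run it as root to read the live directory.

```python
import hashlib
import sys
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(backup, live):
    """Return (missing, extra, changed) lists of relative paths.

    missing: in the backup but not on the live system
    extra:   on the live system but not in the backup
    changed: present in both, but with different contents
    """
    missing = sorted(set(backup) - set(live))
    extra = sorted(set(live) - set(backup))
    changed = sorted(k for k in set(backup) & set(live) if backup[k] != live[k])
    return missing, extra, changed


if __name__ == "__main__":
    # Usage: python3 compare_dirs.py BACKUP_DIR LIVE_DIR
    backup_dir, live_dir = sys.argv[1], sys.argv[2]
    missing, extra, changed = compare(snapshot(backup_dir), snapshot(live_dir))
    for label, paths in (("MISSING", missing), ("EXTRA", extra), ("CHANGED", changed)):
        for path in paths:
            print(f"{label}  {path}")
```

For example: python3 compare_dirs.py /path/to/restored/virtual-server /etc/webmin/virtual-server (the first path is wherever you extract the copy from your backups). Expect some "CHANGED" entries that are perfectly legitimate, since config changes between the backup and now will also show up.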
Backing up after the problem appears can’t fix it. You’d just be backing up the data/config that is broken. (Though doing a safety backup before you start making changes that might make things worse is obviously smart.)
The CLI backup didn’t seem to work. It starts, then says “killed” at the third step, copying records in DNS domain.
If I restore a good domain config, after a couple of minutes it seems to revert to the wrong config again.
I uploaded the backup virtual-server/domains/xxxxxxxxxxxxx file for the affected domain and it briefly looked fine on the frontend, but it has now changed again to the config for one of the sub-servers. The uid, home directory, etc. have all changed to those of the child server.
There was another file in virtual-server/domains/ that didn’t match any of the domain IDs on the system. It had config for both one child and the affected parent virtual server in it, and it seemed to be causing the other config to be overwritten.
I’m not sure where it came from, but I’ve deleted it and now the UI is happier and I can get to the parent domain. There was some suspect data in the config (loads of whitespace characters), so this is probably all down to corruption.
What’s strange is that the other data, like mail and web page files, were all fine, but there were issues with config files. Do they get rewritten regularly, which would put them at higher risk when there is a sudden outage?
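For anyone hitting something similar, a stray file like that could be hunted down with a short script along these lines. This is only a sketch and rests on assumptions: that each file in virtual-server/domains/ is plain key=value text, that it is named after its own id= value, and that dom= holds the domain name. The function names are made up for illustration.

```python
import sys
from pathlib import Path


def parse_kv(text):
    """Parse key=value lines into a dict (if a key repeats, the last one wins)."""
    conf = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def find_suspects(files):
    """files maps file name -> file text. Return a list of (name, reason)."""
    suspects = []
    first_owner = {}  # domain name -> first file that claimed it
    for name in sorted(files):
        conf = parse_kv(files[name])
        file_id = conf.get("id")
        # Assumption: a healthy file is named after its own id= value.
        if file_id != name:
            suspects.append((name, f"file name does not match id={file_id!r}"))
        dom = conf.get("dom")
        if dom is None:
            continue
        if dom in first_owner:
            suspects.append((name, f"dom={dom!r} is also claimed by {first_owner[dom]}"))
        else:
            first_owner[dom] = name
    return suspects


if __name__ == "__main__":
    default_dir = "/etc/webmin/virtual-server/domains"
    directory = Path(sys.argv[1] if len(sys.argv) > 1 else default_dir)
    files = {
        p.name: p.read_text(errors="replace")
        for p in directory.iterdir()
        if p.is_file()
    }
    for name, reason in find_suspects(files):
        print(f"{name}: {reason}")
```

It only reports; it does not delete or change anything. Take a copy of the directory before removing whatever it flags.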
I think it’s all sorted. Thanks for the pointer in regards to the virtual server config directory. I hadn’t previously poked around in all that and didn’t know how/where Virtualmin stores everything.
The domain config had a dns_subof= line that was pointing back to its own ID, so I removed that.
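In case it is useful to anyone else, this is roughly how that self-reference could be checked for in a script. It is a minimal sketch that assumes the domain file is plain key=value text with an id= line. The function name is my own.

```python
def points_to_itself(text, key="dns_subof"):
    """Return True if `key` in a key=value config holds the config's own id."""
    conf = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            conf[name.strip()] = value.strip()
    own_id = conf.get("id")
    # An empty or missing id can never count as a self-reference.
    return bool(own_id) and conf.get(key) == own_id


if __name__ == "__main__":
    import sys

    # Usage: python3 check_subof.py DOMAIN_FILE [DOMAIN_FILE ...]
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            if points_to_itself(handle.read()):
                print(f"{path}: dns_subof points back to its own id")
```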
The problematic server also had an account with an apostrophe in the email address, i.e. “o’reilly”. This was causing an error in quota/linux-lib.pl (unexpected EOF) at line 1013. That might be worth investigating further.
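I have not looked at what line 1013 actually does, so this is only a guess, but "unexpected EOF" is the kind of message a shell gives when it sees an unmatched quote, which is what an unescaped apostrophe in a command line would produce. The sketch below shows the general effect in Python (not the Perl that Webmin uses), with a made-up address.

```python
import shlex

address = "o'reilly@example.com"  # hypothetical address, for illustration only

# Built naively, the apostrophe opens a quoted string that never closes.
unsafe = "echo " + address
try:
    shlex.split(unsafe)
except ValueError as err:
    print("unquoted:", err)

# Quoting the value first keeps it as one intact argument.
safe = "echo " + shlex.quote(address)
print("quoted:", shlex.split(safe))
```

If that is the cause, the fix would be for the calling code to escape the user name before it reaches the shell.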