Restoring from a full backup of all servers - no errors but page is hanging after hours

itmustbe · July 4, 2017, 6:26pm

Today I decided to migrate from my CentOS 6 box with Virtualmin to a new CentOS box with Virtualmin.

I have full backups stored in my Amazon S3 bucket (a single file of 3.7GB for my entire server with around 10 websites).

I pointed the Restore process to my Amazon S3 bucket with my credentials, selected All Features (but not Settings) for the restore, and didn’t touch the defaults under Other restore options. I then clicked the Show What Will Be Restored button.

For just over 2 hours, I saw a lot of activity on the server (by monitoring the droplet at Digital Ocean): lots of bandwidth, CPU and disk I/O activity (where there had been none before). All of that dropped back to normal once the 2 hours were up.

However, the Restore page has not refreshed itself, nor given me any further feedback… the little red line at the top is still not all the way to the right, and the little spinning progress icon is still going, even though it’s been just over an hour now since all activity on the server appeared to cease.

When I try logging into Virtualmin using another browser, I don’t see any Virtual Servers yet, so nothing has yet been restored.

This is the first time I’ve ever tried to restore a Virtualmin backup. Should I just wait on that page for some hours further, even though there’s no server activity? Did something go wrong that I don’t know about? What was the two hours of highly elevated bandwidth/CPU/disk activity about after clicking that button, if not restoration?

Joe · July 4, 2017, 10:54pm

Restoring large backups in the UI is tricky; sometimes it can time out, even with the stuff we have going on to prevent that. It’s really hard to keep a browser waiting for a long time (there are ways around that with web sockets and the like, but we can’t actually implement those ideas in the current Webmin; we’ll start to tackle it in Webmin 2.0 in a couple of months).

I usually prefer to use the command line for big restores, and I’ll run them in a screen or tmux session so even if my ssh session times out I can resume the session later.
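As a rough sketch of that approach, the snippet below builds a command that starts a restore inside a detached tmux session, so the restore keeps running if the ssh connection drops. The session name, the backup path, and the `restore-domain` flags shown (`--source`, `--all-domains`, `--all-features`) are assumptions and not something stated in this thread; check `virtualmin restore-domain --help` on your own system before running anything.

```python
import shlex


def build_tmux_restore_command(backup_path, session_name="vmin-restore"):
    """Return an argv list that runs a Virtualmin restore in a detached tmux session.

    The restore-domain flags below are assumed, not confirmed by the thread;
    verify them with `virtualmin restore-domain --help`.
    """
    restore_cmd = [
        "virtualmin", "restore-domain",
        "--source", backup_path,
        "--all-domains",
        "--all-features",
    ]
    # tmux accepts the command to run as a single shell string.
    return ["tmux", "new-session", "-d", "-s", session_name, shlex.join(restore_cmd)]


if __name__ == "__main__":
    # Hypothetical backup location, for illustration only.
    print(shlex.join(build_tmux_restore_command("/root/backups/all-domains.tar.gz")))
```

The script only prints the command so it can be reviewed first. After starting the session, `tmux attach -t vmin-restore` reconnects to it from a new ssh login.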

There may be log items that can give clues about the state of things (there’s a webmin.log in /var/webmin which is the activity log). You can also look in /home and in /etc/passwd to see if there’s new stuff there.
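One way to do the /etc/passwd and /home check is sketched below. It lists accounts whose home directory sits under /home and reports whether that directory exists yet. It assumes the restored domain owners get their homes under /home; the function name is made up for illustration.

```python
from pathlib import Path


def accounts_with_home_under(passwd_text, home_root="/home"):
    """Return (username, home_dir) pairs whose home directory is under home_root."""
    prefix = home_root.rstrip("/") + "/"
    found = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd layout: name:password:UID:GID:GECOS:home:shell
        if len(fields) < 7:
            continue
        name, home = fields[0], fields[5]
        if home.startswith(prefix):
            found.append((name, home))
    return found


if __name__ == "__main__":
    passwd = Path("/etc/passwd").read_text()
    for name, home in accounts_with_home_under(passwd):
        state = "present" if Path(home).is_dir() else "missing"
        print(f"{name}\t{home}\t({state})")
```

Running it before and after a restore attempt and comparing the two listings shows whether any new users or home directories were created.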

I kinda suspect Virtualmin tried to validate the restore, found some problems, but the browser session had timed out by the time it tried to tell you about it. The validation step does much of the hard work of a restore before even beginning the actual restore: it opens up the archives, checks the features enabled, checks on space needed, and checks for conflicts with existing users and such on the system. If it finds problems, it reports them so you can fix them before doing the restore (since restoring onto a server that differs from the one the backup came from can cause hard-to-diagnose problems down the line).

So, here’s how I’d proceed:

Check to be sure it’s not still working on the restore (check webmin.log, check to be sure disk usage isn’t changing, new files aren’t appearing in /home, new databases aren’t being created, etc.), and if it’s not, try restoring one (preferably small) domain.
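A minimal sketch of the "is it still working?" check: it reports how long ago webmin.log was last written and whether used disk space under /home changes over a short interval. The 30-second interval and both function names are arbitrary choices for illustration, and other processes writing to the same filesystem will also show up in the delta.

```python
import os
import shutil
import time


def seconds_since_modified(path, now=None):
    """Seconds elapsed since the file at `path` was last written."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def used_bytes_delta(path, interval_seconds):
    """Change in used space on the filesystem holding `path` over the interval."""
    before = shutil.disk_usage(path).used
    time.sleep(interval_seconds)
    after = shutil.disk_usage(path).used
    return after - before


if __name__ == "__main__":
    log_path = "/var/webmin/webmin.log"  # activity log mentioned in this thread
    print(f"log last written {seconds_since_modified(log_path):.0f}s ago")
    print(f"/home usage changed by {used_bytes_delta('/home', 30)} bytes in 30s")
```

A log that has been quiet for a long time together with a delta near zero suggests the restore is no longer doing anything.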

That’ll probably reveal why the first restore attempt didn’t do what you expected. There’s probably something about the new system that doesn’t match the old system. There are ways to work around that if you don’t want to make the new system match the old (disable the feature in the restore usually works, and it can be re-enabled with different settings if you still want it, etc.).

Once you’ve restored a couple of smaller domains and have reasonable confidence the new system is suitable for receiving domains from the old, you could then restore the rest in a batch (either CLI or in the browser). We’d like to make it a bit more robust to differences, but sometimes that’s tricky. I mean, if the new system has a feature disabled that you used on the old system and it couldn’t be restored onto the new, that’d potentially be a big deal; e.g. what if you use Mailman mailing lists on the old, and don’t have it on the new (we no longer enable Mailman on fresh installs, because it is so rarely used). They wouldn’t be able to be imported, and so you’d end up having to manually bring them over later, and that’d suck. So, for now, we err on the side of extreme caution when restoring a domain and finding something isn’t available that was on the old system.

midol · July 4, 2017, 11:32pm

" if the new system has a feature disabled that you used on the old system and it couldn’t be restored onto the new, that’d potentially be a big deal; e.g. what if you use Mailman mailing lists on the old, and don’t have it on the new (we no longer enable Mailman on fresh installs, because it is so rarely used). They wouldn’t be able to be imported, and so you’d end up having to manually bring them over later, and that’d suck. "

Yes, well, I have sites that do nothing but Mailman lists, and it would suck all right. How do you assess it as rarely used?

Joe · July 5, 2017, 10:38am

Just install the module (if it’s not already installed) and turn it on. The functionality isn’t gone or unmaintained. The Virtualmin Mailman module will stick around probably forever. I think it even handles creating the mailman user for you these days (I might be wrong about that…it used to be in the install process…I should actually add it back in as an option in the config-system command).

We don’t collect detailed usage data for modules so I can’t say what any given module usage looks like with specific detail, but we get a feel for things after 10+ years of supporting Virtualmin on thousands of systems. Mailman usage on Virtualmin systems is extremely low, probably in the single-digit hundreds (out of a little over 100,000 active installations). So, it’s less than 1%, as far as I can tell.
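To make that arithmetic concrete, here is a tiny sketch using the rough figures above. Taking "single-digit hundreds" as 100 to 900 installations is an interpretation, and none of these numbers are measured data.

```python
def share_percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


total_installs = 100_000  # "a little over 100,000", rounded down
for mailman_installs in (100, 900):  # assumed bounds for "single-digit hundreds"
    pct = share_percent(mailman_installs, total_installs)
    print(f"{mailman_installs} of {total_installs}: {pct:.1f}%")
```

Both ends of that range come out below 1%, which matches the estimate.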

But, we like Mailman and we’ll keep supporting it as long as there are people using it actively and reporting on problems (if any).

itmustbe · July 5, 2017, 1:39pm

Thank you Joe for your detailed comment about what went wrong. I’d kind of wondered about the size of the backup, but thought that it should just work somehow. Since this was not (thankfully) an emergency situation, I simply generated fresh backups from my CentOS 6 box with an individual file for each server, and that worked a treat. I made sure to install additional modules (in my case just newer versions of PHP) that would otherwise have been different between the two servers.

I did have an issue with FTP or with the FTP users (I’m not sure which), but since I just purchased a Pro license, I’ll submit that issue separately in the issue tracker. Everything else is humming along perfectly on my new CentOS 7 box though, and the Virtualmin Restore process was fast and painless with individual files.