There are a few things simmering on this front. VM2 is going public pretty soon, and it provides an overarching "one server manages many" capability, which is more or less necessary for marshaling resources for failover and other kinds of high availability.
Eric has been itching to work on documentation and processes for backup servers and such. When the new website launches (meaning he’s finished with a bunch of other docs I’ve got him working on this month), I’ll set him loose on that, and he and I will work together on requirements for the major dev work it would need on Jamie’s side. Documenting what goes into it will make it more apparent what Virtualmin, VM2, and/or Webmin can do to make those tasks simpler, more automatic, and more foolproof.
As I’ve mentioned in several threads on the topic, the really hard work will never be within Virtualmin’s purview. If you want your applications to scale, then your applications have to be designed to scale…and Virtualmin can’t automatically make them scale.
What we can do, however, includes MySQL replication, shared data via ZFS or GFS, IP takeover, and possibly load balancing via mod_proxy_balancer. All of these are challenging in their own right, and I know a lot of folks who’ve had a hard time with them. They are also big projects, and they more or less have to be designed together, or they won’t fit together very well at the end of it all. (This is also a problem. If the infrastructure we design doesn’t fit exactly with the way people want to do things, they won’t use it. We always design a lot of flexibility into our products, but this is an area where the full stack has to be pretty precisely designed.)
Also, I think I need to mention why this has remained a backburner project (beyond being really complicated to implement): You guys are way in the minority among our customers. There are maybe a dozen folks, out of a couple thousand, who have expressed an interest in clustering and failover and such, and none of our large hosting provider customers or potential customers have even mentioned it. You guys, along with me, Eric, and Jamie, think this stuff is cool because it’s really interesting and fun to play with, and we talk about it quite a bit…but the market isn’t demanding it.
I’m afraid we may have to make this a proprietary option that costs extra in order to make it feasible to build and maintain. Would having to spend some extra money make this feature less appealing? And how much extra would be too much? Right now, it’s looking like, at a bare minimum, it’s going to be a Virtualmin+VM2-only feature, so you’ll be buying at least one license of each. (Virtualmin can be used on a hot spare at no extra cost, though if you begin doing load balancing, we’ll probably want you to buy another Virtualmin license.) VM2 in this particular configuration will probably only be $198 (or free if you have more than five Virtualmin licenses). If an extra plugin, costing maybe another $98 or $198 or even more, were needed, would this be cost-prohibitive? (So, now we’re talking about $296 or $396 or more on top of your Virtualmin licenses in order to get high availability and load balancing.)
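To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the tentative numbers mentioned in this post (VM2 at $198, free with more than five Virtualmin licenses, plus a possible plugin at $98 or $198). The names `VM2_PRICE`, `FREE_VM2_THRESHOLD`, and `extra_cost` are made up for illustration, and none of these prices are final.

```python
# Hypothetical sketch of the tentative pricing discussed above.
# All figures are the "probably" numbers from this post, not final prices.
VM2_PRICE = 198          # tentative VM2 price in this configuration (USD)
FREE_VM2_THRESHOLD = 5   # VM2 is free with MORE than this many Virtualmin licenses


def extra_cost(virtualmin_licenses: int, plugin_price: int) -> int:
    """Return the cost in USD on top of the Virtualmin licenses themselves."""
    vm2 = 0 if virtualmin_licenses > FREE_VM2_THRESHOLD else VM2_PRICE
    return vm2 + plugin_price


if __name__ == "__main__":
    # Compare a single-license customer with one who owns six licenses.
    for licenses in (1, 6):
        for plugin in (98, 198):
            print(f"{licenses} license(s), ${plugin} plugin: "
                  f"${extra_cost(licenses, plugin)} extra")
```

With one Virtualmin license this gives the $296 and $396 figures above. With six licenses VM2 would be free, so the extra cost would just be the plugin.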
I don’t know if the math will work out, even at the higher price I mentioned. But we do know that it’s a big project, requiring the involvement of a lot of people and a lot of software (including a lot of packages not provided by a stock CentOS install), and that the user base is far more limited than for a lot of other capabilities, so we’re going to have to figure out how to make it pay for itself. For some reason it never really occurred to me that we could just say, “Hey guys, if you want it, we’ll build it, but we’re going to want you to pay for it.” We’ve always just thrown all new features into Virtualmin, for free, and assumed that the increased sales would make it worthwhile…but the more I do the math on this one, the more I realize that model is not going to work for clustering and high availability.