Load Balanced Cluster

fatbox · September 28, 2009, 7:12pm

Hi all,

I know the topic of load balancing and clustering has come up here before but I’d like to revisit it and get some input (especially from Joe & Jamie) on the challenges I’m currently facing.

Like others, I have /home, /etc/apache2/sites-{enabled,available} and a few other directories shared amongst all the nodes in the cluster. Virtualmin will be accessed on only one of the nodes (the admin node); if it goes down you lose admin capabilities, but that’s not a large impact. All of the nodes in the cluster run Webmin, and the Webmin cluster module is set up on the admin node so that users & groups can be synchronized.

I use LVS/ldirectord for load balancing via the Direct Routing method so each of the machines has a /32 configured as a loopback alias for the load balanced IP.
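
For anyone reproducing the Direct Routing part, here is a minimal sketch of what each real server needs: the load-balanced IP bound to loopback as a /32, plus the usual ARP settings so the node never answers ARP for that IP. The address 192.0.2.10, the lo:0 label and the function name are placeholders of mine, not values from this cluster. The function only prints the commands, so nothing is changed until its output is fed to a root shell.

```shell
#!/bin/sh
# Sketch: print the commands that prepare one real server for LVS direct routing.
# 192.0.2.10 is a documentation address used purely as an example.

print_realserver_setup() {
	vip=$1
	# Reply to ARP only for addresses configured on the receiving interface,
	# so the VIP bound to lo is never advertised by a real server.
	echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
	# Pick the best matching local address as the source of ARP requests,
	# so the VIP is not announced that way either.
	echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
	# Bind the load-balanced IP to loopback as a /32 alias.
	echo "ip addr add $vip/32 dev lo label lo:0"
}

print_realserver_setup 192.0.2.10
```

Once the printed commands look right, they could be applied with something like `print_realserver_setup 192.0.2.10 | sh` as root.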

Here are the challenges I’m currently facing.

When a new virtual server is created, the users and groups need to be synchronized through the cluster users and groups module. None of my customers get permissions other than Server Owner or Extra Admin, so they cannot run this sync themselves, which means we have to be involved whenever a virtual server is created on a cluster. It would be really nice if Virtualmin could be told which cluster hosts to synchronize the new user/group to, and did so as another step in the creation process.

Whenever you change the Apache config and apply the changes, the same problem arises as with users and groups: Apache needs to be manually restarted on each cluster node. I’m handling this right now by having a script run every minute from cron that compares the Apache config to a previous copy; if it’s different, Apache is restarted and the previous copy is updated. It works, but it’s obviously not ideal.
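
A minimal sketch of that kind of cron-driven check, assuming the site configs live in /etc/apache2/sites-enabled and that a graceful reload is enough. Instead of keeping a previous copy of each file, this variation stores one fingerprint of the whole directory, so additions, edits and removals are all caught by a single comparison. The function names and the state-file path are made up for illustration.

```shell
#!/bin/sh
# Sketch: detect Apache config changes with a single stored fingerprint.

# Print one fingerprint covering the name and content of every *.conf file
# in a directory. Adding, removing, renaming or editing a file changes it.
config_fingerprint() {
	( cd "$1" && for f in *.conf; do
		[ -f "$f" ] || continue   # also skips the literal pattern when nothing matches
		printf '%s ' "$f"
		cksum < "$f"
	done ) | cksum
}

# Return 0 when the fingerprint differs from the one saved in the state file
# (and save the new one); return 1 when nothing changed.
config_changed() {
	dir=$1
	state_file=$2
	new=$(config_fingerprint "$dir")
	old=$(cat "$state_file" 2>/dev/null)
	[ "$new" = "$old" ] && return 1
	printf '%s\n' "$new" > "$state_file"
	return 0
}

# Possible use from cron on each node (paths are assumptions):
#   config_changed /etc/apache2/sites-enabled /var/tmp/apache-conf.sum \
#       && /usr/sbin/apache2ctl graceful
```

Like the per-file approach, the first run has no saved state and therefore reports a change. cksum is only a CRC, which should be fine for spotting edits but is not a security check.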

I have yet to come up with a strategy for Apache logs. The current client I’m converting to Virtualmin from no control panel uses Google Analytics for all their statistics, so it’s not a huge deal, but I’d like to eventually get to a point where the cluster consolidates logs and uses the AWStats cluster reporting ability (logresolvemerge.pl).

Any and all feedback is welcomed.

websight · January 21, 2010, 3:45pm

Did you make any progress with these problems? I am currently facing the same situation and am working on a solution, but why do it myself if something has been cooked up already?

fatbox · January 21, 2010, 4:17pm

Yes, we made lots of progress.

User & Group synchronization is really more of a convenience thing for admins with multiple servers, not a production method of user management amongst multiple machines. LDAP is the solution.

The script to restart apache on cluster nodes is included below. It runs every minute on each slave.

For Apache logs I really dislike logrotate - cronolog is a far better solution. However, you cannot cronolog to the same file from multiple hosts in the cluster when the log dir is on an NFS share. So I wrote a little script that wraps cronolog and adds support for an extra token. Then it’s just logresolvemerge.pl and awstats.pl being run on the logs.
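
The merge step can be sketched roughly as follows, assuming each node writes its own file such as web1-access-20100121.log into the shared log directory. The location of logresolvemerge.pl varies by distribution, so the path below is only a guess that can be overridden through the LOGRESOLVEMERGE variable; the function name and the file naming scheme are likewise placeholders.

```shell
#!/bin/sh
# Sketch: merge the per-node logs for one day into a single file for AWStats.

# Assumed install path; override by setting LOGRESOLVEMERGE.
LOGRESOLVEMERGE=${LOGRESOLVEMERGE:-/usr/share/awstats/tools/logresolvemerge.pl}

# merge_site_logs LOGDIR DAY OUTFILE
# Passes every LOGDIR/<node>-access-DAY.log to the merge tool and writes
# the tool's output to OUTFILE.
merge_site_logs() {
	logdir=$1
	day=$2
	out=$3
	"$LOGRESOLVEMERGE" "$logdir"/*-access-"$day".log > "$out"
}

# Example call (hypothetical paths):
#   merge_site_logs /home/example/logs 20100121 /var/tmp/example-20100121.log
```

The merged file can then be handed to awstats.pl, for instance by pointing the LogFile directive of that site’s AWStats config at it and running an update.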

Script to check apache config on cluster slaves:
#!/bin/bash
#
# FatBox Inc. - Check apache config on cluster hosts and reload apache if necessary
#
# $Id: cluster-check-apache.sh 38 2009-10-14 21:05:16Z evan $
#

apache2ctl=/usr/sbin/apache2ctl
tmpdir=/tmp/.cluster-check-apache
restart=0
debug=0

# print a message only when debug=1
debug() {
	[ "$debug" -eq 1 ] && echo "$1"
}

# create the directory that holds the previous copies
[ -d "$tmpdir" ] || mkdir "$tmpdir"

# check current files for new changes

for file in /etc/apache2/sites-enabled/*.conf; do
	# the pattern stays literal when nothing matches; skip it
	[ "$file" = "/etc/apache2/sites-enabled/*.conf" ] && continue

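	base=$(basename "$file")
	last="$tmpdir/$base"
	if [ -f "$last" ]; then
		# reload when the file differs from the copy saved on the last run
		if ! diff -q "$file" "$last" >/dev/null 2>&1; then
			debug "$file differs from $last"
			restart=1
		fi
	else
		debug "$last doesn't exist"
		restart=1
	fi

	# remember this version for the next run
	cp "$file" "$last"
done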

# check last files for removals

for file in "$tmpdir"/*.conf; do
	[ "$file" = "$tmpdir/*.conf" ] && continue

	base=$(basename "$file")
	orig="/etc/apache2/sites-enabled/$base"

	# the site was removed: reload and forget the saved copy
	if [ ! -f "$orig" ]; then
		restart=1
		rm -f "$file"
	fi
done

if [ "$restart" -eq 1 ]; then
	debug "restarting"
	"$apache2ctl" graceful >/dev/null 2>&1
fi

Script that wraps cronolog so you can include the hostname:
#!/bin/bash
#
# FatBox Inc. - Simple wrapper for cronolog that also understands a new flag: %HOSTNAME
#
# $Id: cronolog-wrapper.sh 29 2009-10-02 17:48:22Z evan $
#

cronolog=/usr/bin/cronolog
hostname=$(hostname)
args=$(echo $* | sed -e "s/%HOSTNAME/$hostname/g")

# now, replace ourselves with cronolog
# ($args is left unquoted so it splits back into separate arguments)

exec $cronolog $args
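
One caveat with wrapping cronolog this way is that building the argument string with echo re-splits arguments on whitespace. A possible variation, sketched here with a hypothetical helper name, substitutes %HOSTNAME one argument at a time; web1 and the log path are example values only.

```shell
#!/bin/sh
# Sketch: expand %HOSTNAME one argument at a time so quoting is preserved.

# expand_hostname_token HOST ARG
# Print ARG with every %HOSTNAME token replaced by HOST.
expand_hostname_token() {
	printf '%s\n' "$2" | sed -e "s/%HOSTNAME/$1/g"
}

# What a log template would become on a node called web1:
expand_hostname_token web1 "/var/log/apache2/%HOSTNAME-access-%Y%m%d.log"
# prints /var/log/apache2/web1-access-%Y%m%d.log

# Inside a wrapper, "$@" could be rebuilt in place before handing over:
#   for arg in "$@"; do
#       set -- "$@" "$(expand_hostname_token "$(hostname)" "$arg")"
#       shift
#   done
#   exec /usr/bin/cronolog "$@"
```

The remaining % tokens are left alone, so cronolog still expands the date parts itself.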

SoftDux · March 12, 2010, 11:50am

@fatbox,

I have some questions for you about this setup:

1. When a client signs up, or you create a new client / user, do you simply create him on the master node, or do you need to do anything on the slave nodes as well?
2. When a client uploads files via FTP, do they automatically get synchronized to all servers, or do you need to manually sync the file(s) to the other servers?
3. When email accounts (i.e. email address, password & quota) get created by the user, do those email accounts automatically get synchronized to all servers as well?
4. How do you deal with email (i.e. delivery, bounces, IMAP mailboxes, etc.) when the main node is offline?
5. Are MySQL databases automatically created on all nodes, or not?

fatbox · March 12, 2010, 4:14pm

1. LDAP takes care of all of that. We bind customers to a specific DN in the system ldap.conf so that different servers can’t see the user entries of other customers.
2. There is no synchronization; /home is mounted as an NFS share from a highly available disk cluster.
3. Yes, same as #1. Postfix is configured to use LDAP to find accounts, so once a user is added through Virtualmin any server in the cluster can receive mail for that user (since they all have the same /home).
4. The ONLY thing you can’t do if the main node is offline is use Virtualmin; email still gets delivered (all hosts in the cluster are added as entries to the zone’s MX records).
5. No, we don’t support local MySQL accounts in our clusters. It’s just WAY too hard to deploy reliably. If we set up a cluster, the customer must use our separate MySQL cluster. It’s better that way: we most often deploy clusters for performance reasons, and offloading the SQL to a dedicated cluster that’s been built for HA MySQL means the performance is leaps and bounds above what it would be running locally.

LPCA · December 15, 2010, 7:03am

yes, LDAP is the way to go for users and groups. the method will likely involve PAM.
or you can use MySQL or PostgreSQL for that. again with PAM

nowadays even Webmin can make use of LDAP, MySQL or PostgreSQL to store its data (users/groups, perhaps more)

btw, LDAP can be configured to use an RDBMS engine, like MySQL or PostgreSQL, instead of flat files or Berkeley DB (BDB), which has the potential of improving performance.

nibb · November 17, 2012, 2:18pm

Hi, I know this topic is old, but I wanted to know if this information about users and groups is still valid today.

Leaving aside heartbeat, load balancing, shared SAN, dual switches, and all the other fancy equipment you need for an HA setup, my question pertains only to the software, Webmin itself.

If you cluster Webmin servers, I see you can sync users and user groups, and even files, which might work for very simple configuration files (not data, which would be on shared storage).

It seems you both mentioned LDAP is the way to go. Should I assume that the user/group sync in Webmin is not up to the task, so LDAP is the way to go, or is that just because this topic is from 2010 and today you can just use Webmin for this?

From what I understand, maybe these features were added later, and you could code or trigger some automatic sync for this. The same goes for Apache: sync the Apache config file to the cluster using the RPC protocol and restart it.

If not, please comment on why LDAP should be used instead of the built-in clustering features in Webmin.

Regards