Thursday, 16 June 2011


I recently have tested out running Heartbeat (finally, took too long to get to this, but that's another story). This is a cluster resource manager (CRM) which polls nodes in a cluster and brings resources up when a node failure is detected.

It's interesting. I wouldn't call it elegant really, maybe the newer Pacemaker would seem cleaner. But it is simple and at least in testing it is effective especially when combined with DRBD which I posted on earlier. The thing is where DRBD really seems built for top-notch resiliency and flexibility, Heartbeat seems it will work, but it's not obvious that you'll get what you expected - maybe it's just the documentation on DRBD was really well done.

At any rate, there is great documentation on getting Heartbeat up with DRBD both from the CentOS wiki and from DRBD. I used heartbeat with drbd83 in CentOS.

What Heartbeat will do is listen for heartbeats from a peer node in a cluster and if a peer goes down, it will bring up services on the working node. There's a handful of important things about this to keep in mind.

First is the heartbeat - this is just a stand-alone network connection between two nodes so if that connection goes down or the heartbeats get choked out by competing traffic, Heartbeat may well decide you have a node failure. This is not a trivial problem because now that Heartbeat can kill services on an active node, this is potentially an new point of failure. And this is common to many HA configurations including DRBD itself though as we know, it will identify split-brain and gives you some recourse for repairs. So the suggestions here are to use a dedicated connection, preferably a point-to-point connection with a cross-over cable or a serial port - and this is not uncommon for clusters (the point-to-point connection) like in this white paper for Microsoft Storage Server.

Then there is the issue of resource management - when the CRM is managing the resources, the usual OS procedures should not. If Heartbeat is in charge of bringing up MySQL, you shouldn't be starting MySQL from the init scripts when the OS boots. Now the nice thing with DRBD is that it's behaviour is consistent with this paradigm - when DRBD resources start up, they are in "secondary" mode and cannot be accessed by the OS. So if you have a file share protected by DRBD, Samba wouldn't be started by the OS, and likewise, that file system would be unavailable when the OS starts (by default at least). So here, Heartbeat makes a lot of sense. You take a 2 node cluster for example, when the nodes start up, Heartbeat looks for the peer, picks someone to become active, and then would make "primary" the DRBD resource on that peer, mount the file system, start smb. On the stand-by node, you would both have 'smb' off and the file system would not be writeable which helps ensure consistency.

I guess I could go on about Heartbeat quite a bit, but there's one last thing to mention specifically here and that's the style of cluster. There are "R1" style clusters which are simple but limited to 2 resources (and other limitations) and then there are CRM enabled clusters which are more robust but more complicated to configure. I have only used R1 because it was sufficient for my needs - 2 nodes, one was known "preferred", keeping cluster configuration in sync "manually" wasn't onerous. But CRM enabled clusters are more interesting because you can add more nodes and it will distribute the cluster configuration automatically, etc.

The one thing I haven't really touched on is the quorum which others who are more familiar with cluster management will be more familiar with than I. Basically with Heartbeat in an R1 style cluster, there isn't going to be a quorum. Your configuration is maintained pretty much manually, services are only running on one node, etc. In CRM style Heartbeat or other application clusters, the quorum is what all the nodes agree on and typically stored in a file. On Windows Storage Server and other clusters, the quorum is stored on the shared disk which means any problem there means the cluster fails. With Heartbeat, the quorum file is copied among the nodes but this is susceptible to becoming out of sync like if there is a communication failure on the heartbeat channel leading to a split brain. Or this is my limited understanding of this. At any rate, it is a problem and it isn't trivial in working with active/active or multi-node configurations.

Popular Posts