Part 12: Two-node cluster configuration. My Study Notes for Red Hat Certificate of Expertise in Clustering and Storage Management Exam (EX436)

Two-node cluster configurations:

Two-node clusters in Red Hat Enterprise Linux operate in a special mode. Traditionally, fencing requires a property of the cluster called quorum - the minimum set of hosts required to provide service (in some cluster technologies this is also referred to as a primary component; the terms are synonymous). In Red Hat Enterprise Linux, the specific quorum algorithm used is called simple-majority quorum, meaning a majority of hosts must be online in order for quorum to be present. This means that in an 8-node cluster, at least 5 nodes must be online in order to provide service; in a 5-node cluster, at least 3 nodes must be online, and so forth. Generally speaking, quorum is a means to prevent a case called a split brain, where two subsets of a cluster operate independently of one another.
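The simple-majority threshold is easy to compute; a minimal sketch (assuming one vote per node, as in the default configuration):

```shell
# Simple-majority quorum: strictly more than half of the expected votes
# must be present. With one vote per node, the threshold for N nodes is:
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

quorum_needed 8   # 5 of 8 nodes must be online
quorum_needed 5   # 3 of 5 nodes must be online
quorum_needed 2   # 2 of 2 - which is why two_node mode exists
```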

In two-node clusters, there is no majority when only one node is active. Instead, the cman component relies on a special mode called two_node. In two_node mode, both hosts always have quorum, resulting in a limited split brain behavior. The reason that this is a limited split brain case is because all of the components provided by Red Hat's clustering products rely not only on quorum, but also on a mechanism called I/O Fencing (some clustering technologies call this STONITH or STOMITH, acronyms for Shoot The Other Node/Machine In The Head). I/O fencing, or simply fencing, is an active countermeasure taken by a cluster in order to prevent a presumed-dead or misbehaving cluster member from writing data to a piece of critical shared media. The act of cutting off this presumed-dead member prevents data corruption on shared media. Since all of Red Hat's High Availability and Resilient Storage components rely not only on quorum, but also on fencing, data integrity is preserved in this limited split brain case.

Now, when a two-node cluster partitions for any reason, both nodes, since they have quorum, enter what is called a fence race. This isn't a bad thing; it just means both are trying to cut each other off in order to establish a new leader so recovery can complete, thereby allowing the cluster to continue providing service to clients. In many cases, this is fine:

Cable pull on a single host
Single fencing device which serializes access

In cases where there are multiple fencing devices (especially integrated power management, such as HP iLO), the outcome can be undesirable. This is because it is possible for both nodes to issue a fencing action at the same time, causing both nodes to turn off. This behavior can cause a cluster outage.

Configure one node's fencedevice with the delay attribute:

NOTE: The node with the fencedevice delay will win fence races. The node without the fencedevice delay will be the one to get fenced in the event of a network split and fence race.

The delay used here should usually be at least 5 seconds, to give one node enough time to complete the fencing operation before the other node begins. This may need to be adjusted based on the actual amount of time it takes the fence action to complete.
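A sketch of what this looks like in cluster.conf (node names, the fence_ipmilan agent, and the elided IP addresses are illustrative, not taken from the cluster configured below):

```xml
<!-- node1's fence device gets delay="5", so node1 wins any fence race;
     node2's device has no delay, so node2 is the one that gets fenced. -->
<fencedevices>
  <fencedevice agent="fence_ipmilan" name="ipmi-node1" ipaddr="..." delay="5"/>
  <fencedevice agent="fence_ipmilan" name="ipmi-node2" ipaddr="..."/>
</fencedevices>
```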

It is recommended that the "losing" node (the one without a delay on its fence device) be configured to avoid fencing the other node when it boots back up.

We can configure the losing node either by turning off the cluster services at boot, or by setting the fence action to off instead of reboot, so the fenced server stays down rather than coming back up.
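Disabling the cluster services at boot on the losing node could look like this (a sketch using the standard RHEL 6 init scripts; run on the node without the delay):

```shell
# Keep cluster services from starting automatically after a fence reboot.
# After the split is resolved, start them by hand.
chkconfig cman off
chkconfig clvmd off
chkconfig rgmanager off
```

After the network problem is fixed, the services are started manually with `service cman start`, `service clvmd start` and `service rgmanager start`, in that order.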


The third option, not recommended unless necessary, is a tiebreaker: a quorum disk.

There are a number of factors to consider when determining whether or not to include a quorum disk in your cluster. In most cases, QDisk is unnecessary and can lead to additional configuration complexity, increasing the likelihood that an incorrect setting might cause unexpected behavior. Red Hat recommends that you only deploy QDisk if absolutely necessary.

Two-Node Clusters with Separate Networks for Cluster Interconnect (Heartbeat) and Fencing

Two-node clusters are inherently susceptible to split-brain scenarios, where each node considers itself the only remaining member of the cluster when communication between the two is severed. Because the nodes cannot communicate in order to agree on which node should be removed from the cluster, both will try to remove the other via fencing. The phenomenon where both hosts attempt to fence each other simultaneously is referred to as a "fence race." In a typical environment where fence devices are accessed on the same network that is used for cluster communication, this is not a problem, because the node that has lost its network connection will be unable to fence the other and will thus lose the race. Some shared fencing devices serialize access, meaning that only one host can succeed. However, if there is one fencing device per node and the devices are accessed over a network that is not used for cluster communication, then the potential exists for both hosts to send the fencing request simultaneously. This results in what is called "fence death," where both nodes of the cluster are powered off.

QDisk can deal with fence races in these situations by predetermining which node should remain alive, using either a set of heuristics or an automatic master-wins mechanism. However, in Red Hat Enterprise Linux 5.6 and later and Red Hat Enterprise Linux 6.1 and later, Red Hat recommends using the delay option for fencing agents instead. Using the delay option with a given fencing agent defines a configuration-based winner of the fence race and is simpler to configure and use than a quorum disk.

In this configuration, QDisk can also prevent a problem known as "fencing loops," which only occur in two-node clusters. Fencing loops occur when a cluster node reboots after being fenced but cannot rejoin the cluster because the cluster interconnect is still unavailable. It then fences the surviving node, which then reboots, and the process repeats indefinitely. An alternative method for preventing fencing loops is to disable the cluster software at boot using the chkconfig utility.

First we are going to remove the third node, centos-clase3 (cluster node name centosclu3hb1), from the cluster:

[root@centos-clase1 ~]# ccs -h centos-clase1 --rmnode centosclu3hb1
[root@centos-clase1 ~]# cman_tool version -r

[root@centos-clase3 ~]# service rgmanager stop
Stopping Cluster Service Manager: [ OK ]
[root@centos-clase3 ~]# service clvmd stop
Signaling clvmd to exit [ OK ]
clvmd terminated [ OK ]
[root@centos-clase3 ~]# service cman stop
Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Waiting for corosync to shutdown: [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]

[root@centos-clase2 ~]# clustat
Cluster Status for fomvsclu @ Wed Sep 4 19:35:44 2013
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 centosclu1hb1                        1    Online, rgmanager
 centosclu2hb1                        2    Online, Local, rgmanager

 Service Name                 Owner (Last)                 State
 ------- ----                 ----- ------                 -----
 service:hanfs                centosclu2hb1                started
 service:sambapublic          centosclu2hb1                started
 service:wwwprod              centosclu2hb1                started

The node is gone and votes are recalculated:

[root@centos-clase2 ~]# cman_tool status
Version: 6.2.0
Config Version: 52
Cluster Name: fomvsclu
Cluster Id: 16787
Cluster Member: Yes
Cluster Generation: 132
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 8
Ports Bound: 0 177
Node name: centosclu2hb1
Node ID: 2
Multicast addresses:
Node addresses:

Now we are going to create and configure the quorum disk:

[root@centos-clase2 ~]# mkqdisk -c /dev/mapper/quorumdisk -l QUORUMDISK
mkqdisk v3.0.12.1

Writing new quorum disk label 'QUORUMDISK' to /dev/mapper/quorumdisk.
WARNING: About to destroy all data on /dev/mapper/quorumdisk; proceed [N/y] ? y
Initializing status block for node 1...
Initializing status block for node 2...
Initializing status block for node 3...
Initializing status block for node 4...
Initializing status block for node 5...
Initializing status block for node 6...
Initializing status block for node 7...
Initializing status block for node 8...
Initializing status block for node 9...
Initializing status block for node 10...
Initializing status block for node 11...
Initializing status block for node 12...
Initializing status block for node 13...
Initializing status block for node 14...
Initializing status block for node 15...
Initializing status block for node 16...

We can check from another node if we can see the quorum disk:

[root@centos-clase1 ~]# mkqdisk -L
mkqdisk v3.0.12.1

Magic: eb7a62c2
Created: Wed Sep 4 19:55:49 2013
Kernel Sector Size: 512
Recorded Sector Size: 512

We now configure the device. The --setquorumd option accepts all the attributes of the quorumd tag; you can check them out with man qdisk:
[root@centos-clase1 ~]# ccs -h centos-clase1 --setquorumd device="/dev/mapper/quorumdisk" votes="1"
We add a heuristic; all the options come from man qdisk:

[root@centos-clase1 ~]# ccs -h centos-clase1 --addheuristic program="ping -c1 -w1" score="1"
[root@centos-clase1 ~]# ccs -h centos-clase1 --lsquorum
Quorumd: device=/dev/mapper/quorumdisk, votes=1
heuristic: program=ping -c1 -w1, score=1
[root@centos-clase1 ~]#

We can check the result in cluster.conf:
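The resulting entry should look roughly like this (a sketch reconstructed from the ccs commands above, not captured output):

```xml
<quorumd device="/dev/mapper/quorumdisk" votes="1">
    <heuristic program="ping -c1 -w1" score="1"/>
</quorumd>
```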

We don't set tko and interval, because RHEL 6.1 and newer calculates them on its own.

Once we have the quorum disk configured, we start the qdiskd daemon (part of the cman package) on both nodes:

[root@centos-clase1 init.d]# ./cman help
Usage: ./cman {start|stop|restart|reload|force-reload|condrestart|try-restart|status}
[root@centos-clase1 init.d]# ./cman status
qdiskd is stopped
[root@centos-clase1 init.d]# ./cman start
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Starting qdiskd... [ OK ]
Waiting for quorum... [ OK ]
Starting fenced... [ OK ]
Starting dlm_controld... [ OK ]
Tuning DLM kernel config... [ OK ]
Starting gfs_controld... [ OK ]
Unfencing self... [ OK ]
Joining fence domain... [ OK ]
[root@centos-clase1 init.d]# clustat
Cluster Status for fomvsclu @ Wed Sep 4 20:11:45 2013
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 centosclu1hb1                        1    Online, Local, rgmanager
 centosclu2hb1                        2    Online, rgmanager
 /dev/mapper/quorumdisk               0    Online, Quorum Disk ----------> here we have our quorum disk

 Service Name                 Owner (Last)                 State
 ------- ----                 ----- ------                 -----
 service:hanfs                centosclu2hb1                started
 service:sambapublic          centosclu2hb1                started
 service:wwwprod              centosclu2hb1                started

[root@centos-clase1 init.d]# cman_tool status
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Quorum device votes: 1
Total votes: 3
Quorum: 2

ALSO IMPORTANT: starting with RHEL 6.3, if the node is still responding over the network, it is not rebooted even when it fails to update the qdisk in time; it stays up. So from 6.3 on there is no need to raise the totem token value to more than double the quorumd tko * interval.

