Docker High Availability
========================

If you are responsible for hosting Docker production infrastructure, you are surely interested in improving service availability. Imagine you have stacked tens of Docker environments on one physical host: you certainly want to ensure that the people accessing your platform are not prevented from working by a hardware failure.

The first step consists in implementing a manual failover cluster, as described in the `Docker relocation `_ tutorial. The second step, covered in this tutorial, shows how to follow up with an HA configuration by implementing OpenHA on top of OpenSVC services, leading to a dual-node high availability Docker cluster. The OpenSVC agent and OpenHA are both free, open source software.

Pre-requisites
--------------

* OpenSVC services managing Docker container(s); in this tutorial we use only 2 services:

  * the first one is registry.opensvc.com
  * the second one is busybox.opensvc.com

* 2 physical servers, both having:

  * network attachments on the same IP subnet
  * shared storage: the same block device seen from both operating systems
  * OpenSVC agent installed
  * Docker installed
  * OpenHA installed (installation described in `Getting Started - OpenHA Integration `_ and `High Availability Setup `_)

* a heartbeat implementation (network, disk, or both)
* 2 services configured

Initial State
-------------

We assume that both OpenSVC services are up and running on node ``deb1.opensvc.com``.

Service registry.opensvc.com
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Service configuration file::

    root@deb1:/etc/opensvc# cat registry.opensvc.com.env
    [DEFAULT]
    autostart_node = deb1.opensvc.com
    app = OSVCLAB
    service_type = DEV
    nodes = deb1.opensvc.com deb2.opensvc.com
    docker_data_dir = /opt/registry.opensvc.com/dockerdata
    docker_daemon_args = --ip 37.59.71.25

    [container#1]
    type = docker
    run_image = de9a91835513
    run_args = -v /opt/registry.opensvc.com/registrydata:/registrydata -e STORAGE_PATH=/registrydata -p 5000:5000

    [ip#1]
    ipdev = eth0
    ipname = registry.opensvc.com

    [vg#1]
    vgname = vgregistry

    [fs#1]
    mnt_opt = rw
    mnt = /opt/registry.opensvc.com
    dev = /dev/mapper/vgregistry-lvregistryroot
    type = ext4

    [fs#2]
    mnt_opt = rw
    mnt = /opt/registry.opensvc.com/dockerdata
    dev = /dev/mapper/vgregistry-lvdockerdata
    type = ext4

    [fs#3]
    mnt_opt = rw
    mnt = /opt/registry.opensvc.com/registrydata
    dev = /dev/mapper/vgregistry-lvregistrydata
    type = ext4

Service state on node deb1.opensvc.com::

    root@deb1:/etc/opensvc# registry.opensvc.com print status
    registry.opensvc.com
    overall up
    |- avail up
    |  |- container#1 .... up e9828ec4620a@registry:latest
    |  |- vg#1 .... up vgregistry
    |  |- fs#1 .... up /dev/mapper/vgregistry-lvregistryroot@/opt/registry.opensvc.com
    |  |- fs#2 .... up /dev/mapper/vgregistry-lvdockerdata@/opt/registry.opensvc.com/dockerdata
    |  |- fs#3 .... up /dev/mapper/vgregistry-lvregistrydata@/opt/registry.opensvc.com/registrydata
    |  '- ip#1 .... up registry.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb n/a

    root@deb1:/etc/opensvc# registry.opensvc.com docker ps
    CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                        NAMES
    e9828ec4620a        registry:latest     /bin/sh -c 'exec doc   23 minutes ago      Up 23 minutes       37.59.71.25:5000->5000/tcp   registry.opensvc.com.container.1

Service state on node deb2.opensvc.com::

    root@deb2:/etc/opensvc# registry.opensvc.com print status
    registry.opensvc.com
    overall down
    |- avail down
    |  |- container#1 .... down de9a91835513
    |  |  |  # docker daemon is not running
    |  |- vg#1 .... down vgregistry
    |  |- fs#1 .... down /dev/mapper/vgregistry-lvregistryroot@/opt/registry.opensvc.com
    |  |- fs#2 .... down /dev/mapper/vgregistry-lvdockerdata@/opt/registry.opensvc.com/dockerdata
    |  |- fs#3 .... down /dev/mapper/vgregistry-lvregistrydata@/opt/registry.opensvc.com/registrydata
    |  '- ip#1 .... down registry.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb n/a
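Optionally, you can exercise the registry before adding the HA layer, for example by pushing an image to it from any Docker host that can reach the service address. This quick check is not part of the original procedure; it assumes the Docker CLI syntax of that era and uses ``busybox`` only as an example image::

    # optional sanity check, run from any docker host that can reach 37.59.71.25
    $ docker pull busybox
    $ docker tag busybox registry.opensvc.com:5000/busybox
    $ docker push registry.opensvc.com:5000/busybox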
Service busybox.opensvc.com
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Service configuration file::

    root@deb1:/etc/opensvc# cat busybox.opensvc.com.env
    [DEFAULT]
    autostart_node = deb1.opensvc.com
    app = OSVCLAB
    service_type = DEV
    nodes = deb1.opensvc.com deb2.opensvc.com
    docker_data_dir = /opt/busybox.opensvc.com/appdata
    docker_daemon_args = --ip 37.59.71.24

    [container#1]
    type = docker
    run_image = b073e328878e
    subset = 00database

    [container#2]
    type = docker
    run_image = b073e328878e
    subset = 01appservers

    [container#3]
    type = docker
    run_image = b073e328878e
    subset = 01appservers

    [container#4]
    type = docker
    run_image = b073e328878e
    subset = 01appservers

    [container#5]
    type = docker
    run_image = b073e328878e
    subset = 01appservers

    [subset#container.docker:01appservers]
    parallel = true

    [container#6]
    type = docker
    run_image = b073e328878e
    subset = 02webservers

    [container#7]
    type = docker
    run_image = b073e328878e
    subset = 02webservers

    [subset#container.docker:02webservers]
    parallel = false

    [ip#1]
    ipdev = eth0
    ipname = busybox.opensvc.com

    [vg#1]
    vgname = vgbusybox
    scsireserv = false

    [fs#1]
    mnt_opt = rw
    mnt = /opt/busybox.opensvc.com
    dev = /dev/mapper/vgbusybox-lvbusyboxroot
    type = ext4

    [fs#2]
    mnt_opt = rw
    mnt = /opt/busybox.opensvc.com/appdata
    dev = /dev/mapper/vgbusybox-lvbusyboxdata
    type = ext4
Service state on deb1.opensvc.com::

    root@deb1:/etc/opensvc# busybox.opensvc.com print status
    busybox.opensvc.com
    overall up
    |- avail up
    |  |- container#1 .... up c37b83887947@opensvc/busybox:date
    |  |- container#2 .... up a189884f060a@opensvc/busybox:date
    |  |- container#3 .... up fb2513423499@opensvc/busybox:date
    |  |- container#4 .... up 3a7e545bced2@opensvc/busybox:date
    |  |- container#5 .... up 1b6a3d00661e@opensvc/busybox:date
    |  |- container#6 .... up 173a812f4ed5@opensvc/busybox:date
    |  |- container#7 .... up 59fca3cd2745@opensvc/busybox:date
    |  |- vg#1 .... up vgbusybox
    |  |- fs#1 .... up /dev/mapper/vgbusybox-lvbusyboxroot@/opt/busybox.opensvc.com
    |  |- fs#2 .... up /dev/mapper/vgbusybox-lvbusyboxdata@/opt/busybox.opensvc.com/appdata
    |  '- ip#1 .... up busybox.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb n/a

    root@deb1:/etc/opensvc# busybox.opensvc.com docker ps
    CONTAINER ID        IMAGE                  COMMAND                CREATED             STATUS              PORTS               NAMES
    59fca3cd2745        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.7
    173a812f4ed5        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.6
    fb2513423499        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.3
    3a7e545bced2        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.4
    a189884f060a        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.2
    1b6a3d00661e        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.5
    c37b83887947        opensvc/busybox:date   /bin/sh -c 'while tr   13 minutes ago      Up 13 minutes                           busybox.opensvc.com.container.1

Service state on deb2.opensvc.com::

    root@deb2:/etc/opensvc# busybox.opensvc.com print status
    busybox.opensvc.com
    overall down
    |- avail down
    |  |- container#1 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#2 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#3 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#4 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#5 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#6 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- container#7 .... down b073e328878e
    |  |  |  # docker daemon is not running
    |  |- vg#1 .... down vgbusybox
    |  |- fs#1 .... down /dev/mapper/vgbusybox-lvbusyboxroot@/opt/busybox.opensvc.com
    |  |- fs#2 .... down /dev/mapper/vgbusybox-lvbusyboxdata@/opt/busybox.opensvc.com/appdata
    |  '- ip#1 .... down busybox.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb n/a

Nodes State
^^^^^^^^^^^

From the physical nodes point of view, the services are in the following states (check the overall status column).

UP on node deb1.opensvc.com::

    root@deb1:/etc/opensvc# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     up         up      up      up      n/a     n/a     n/a     up      up      up       False
    busybox.opensvc.com  DEV       hosted     up         up      up      up      n/a     n/a     n/a     up      up      up       False

DOWN on node deb2.opensvc.com::

    root@deb2:/etc/opensvc# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     down       down    down    down    n/a     n/a     n/a     up      down    down     False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     n/a     up      down    down     False

Services registry.opensvc.com and busybox.opensvc.com are currently running on node deb1.opensvc.com. We know that they can be manually relocated to node deb2.opensvc.com; we now need this to happen automatically.
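As a reminder, without OpenHA a failover is a manual operation: an administrator stops the service on one node and starts it on the other, as covered in the relocation tutorial. The sketch below sums up the sequence we are about to automate::

    # manual relocation, the procedure OpenHA will automate from now on
    root@deb1:/# registry.opensvc.com stop      # releases containers, filesystems, volume group and ip on deb1
    root@deb2:/# registry.opensvc.com start     # imports them and starts the containers on deb2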
Cluster Configuration
---------------------

One important concept to understand is that the HA cluster is made of 2 different pieces of software:

* OpenSVC agent: deals with resource management and service actions; it acts as the cluster resource manager

  * this part is already in place, as described in the previous tutorial

* OpenHA: monitors both cluster nodes and decides where to start/stop/failover which service

  * this subject is covered in the following sections

OpenSVC Configuration
^^^^^^^^^^^^^^^^^^^^^

2 modifications are needed:

1. remove the autostart_node keyword from the service configuration files

   * OpenSVC no longer decides where the service starts; this is now OpenHA's decision

2. tell OpenSVC that the services are now driven by OpenHA

First modification implementation::

    root@deb1:/# cd /etc/opensvc
    root@deb1:/etc/opensvc# cp registry.opensvc.com.env registry.opensvc.com.backup
    root@deb1:/etc/opensvc# cat registry.opensvc.com.backup | grep -v autostart_node > registry.opensvc.com.env
    root@deb1:/etc/opensvc# diff registry.opensvc.com.backup registry.opensvc.com.env
    2d1
    < autostart_node = deb1.opensvc.com
    root@deb1:/etc/opensvc# cp busybox.opensvc.com.env busybox.opensvc.com.backup
    root@deb1:/etc/opensvc# cat busybox.opensvc.com.backup | grep -v autostart_node > busybox.opensvc.com.env
    root@deb1:/etc/opensvc# diff busybox.opensvc.com.backup busybox.opensvc.com.env
    2d1
    < autostart_node = deb1.opensvc.com

Second modification implementation::

    root@deb1:/etc/opensvc# cat >> registry.opensvc.com.env << EOF
    [hb#0]
    type = OpenHA
    name = registry
    EOF
    root@deb1:/etc/opensvc# cat >> busybox.opensvc.com.env << EOF
    [hb#0]
    type = OpenHA
    name = busybox
    EOF
    root@deb1:/etc/opensvc# tail -3 registry.opensvc.com.env
    [hb#0]
    type = OpenHA
    name = registry
    root@deb1:/etc/opensvc# tail -3 busybox.opensvc.com.env
    [hb#0]
    type = OpenHA
    name = busybox

Propagate the configuration::

    root@deb1:/etc/opensvc# allupservices syncnodes --force

.. note::

   This is a mass action command: each service in "up" state propagates its configuration to the other nodes. We could also have pushed the configuration of our 2 services only.

To continue, we stop both services on node deb1.opensvc.com (not mandatory, but easier to understand in this tutorial context)::

    root@deb1:/etc/opensvc# registry.opensvc.com stop --cluster
    root@deb1:/etc/opensvc# busybox.opensvc.com stop --cluster

.. warning::

   As OpenSVC is no longer supposed to pilot the services itself, due to the heartbeat resource declared in the .env file, a safeguard prevents it from stopping the service. It can be overridden with the ``--cluster`` option.
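Before going further, it is worth checking on both nodes that the two modifications are really in place. The commands below are only an illustrative verification, not part of the original procedure::

    # expect no autostart_node line left, and one [hb#0] section per service
    root@deb1:/etc/opensvc# grep autostart_node registry.opensvc.com.env busybox.opensvc.com.env || echo "ok: no autostart_node left"
    root@deb1:/etc/opensvc# grep -A 2 '\[hb#0\]' registry.opensvc.com.env busybox.opensvc.com.env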
OpenHA installation
^^^^^^^^^^^^^^^^^^^

The installation is described in `Getting Started - OpenHA Integration `_ and `High Availability Setup `_::

    # wget -O /tmp/openha.latest.rpm http://repo.opensvc.com/rpms/deps/el6/openha-0.4.1.osvc1-0.x86_64.rpm
    # alien --to-deb /tmp/openha.latest.rpm
    # dpkg -i openha_0.4.1.osvc1-1_amd64.deb
    # dpkg -l|grep -i openha
    ii  openha    0.4.1.osvc1-1    amd64    EASY high-availability software.
    # ls /usr/local/cluster
    bin  conf  doc  env.sh  ezha.init  log  services

OpenHA configuration
^^^^^^^^^^^^^^^^^^^^

In the file :file:`/usr/local/cluster/conf/nodes` we list the physical cluster nodes.

**On both nodes**::

    # cat /usr/local/cluster/conf/nodes
    deb1.opensvc.com deb2.opensvc.com

The heartbeat configuration file :file:`/usr/local/cluster/conf/monitor` contains the following lines.

.. note::

   Whatever the heartbeat type (network, disk), the same logic applies: one node announces its heartbeat status while the other node reads it.

**On both nodes**::

    # cat /usr/local/cluster/conf/monitor
    deb1.opensvc.com net eth1 239.131.50.10 5010 10
    deb1.opensvc.com net eth2 239.131.50.11 5011 10
    deb2.opensvc.com net eth1 239.131.50.12 5012 10
    deb2.opensvc.com net eth2 239.131.50.13 5013 10
    deb1.opensvc.com dio /dev/mapper/disk.hb.0 0 10
    deb2.opensvc.com dio /dev/mapper/disk.hb.0 10 10

This 3-heartbeat configuration (2 network + 1 disk) means:

- ``deb1.opensvc.com`` will send its heartbeat through eth1 on multicast IP 239.131.50.10 port 5010, with a 10 second timeout, and ``deb2.opensvc.com`` will listen on the same IP and port
- ``deb1.opensvc.com`` will send its heartbeat through eth2 on multicast IP 239.131.50.11 port 5011, with a 10 second timeout, and ``deb2.opensvc.com`` will listen on the same IP and port
- ``deb2.opensvc.com`` will send its heartbeat through eth1 on multicast IP 239.131.50.12 port 5012, with a 10 second timeout, and ``deb1.opensvc.com`` will listen on the same IP and port
- ``deb2.opensvc.com`` will send its heartbeat through eth2 on multicast IP 239.131.50.13 port 5013, with a 10 second timeout, and ``deb1.opensvc.com`` will listen on the same IP and port
- ``deb1.opensvc.com`` will write its heartbeat starting from the first block of disk ``/dev/mapper/disk.hb.0`` with a 10 second timeout, and ``deb2.opensvc.com`` will read it from the first block
- ``deb2.opensvc.com`` will write its heartbeat starting from block number 10 of disk ``/dev/mapper/disk.hb.0`` with a 10 second timeout, and ``deb1.opensvc.com`` will read it from block number 10

.. note::

   Adjust the heartbeat configuration to your setup: disk only, network only, or a mix of both.

The OpenHA services are now configured, mapped 1-to-1 to the OpenSVC services.

Load the cluster environment, on both nodes::

    # export EZ=/usr/local/cluster
    # . $EZ/env.sh

Create the OpenHA service ``registry``, on both nodes::

    # $EZ_BIN/service -a registry /etc/opensvc/registry.opensvc.com.cluster deb1.opensvc.com deb2.opensvc.com /bin/true
    Creating service registry :
    Directory /usr/local/cluster/services/registry created
    State file /usr/local/cluster/services/registry/STATE.deb1.opensvc.com created with initial state FROZEN_STOP
    State file /usr/local/cluster/services/registry/STATE.deb2.opensvc.com created with initial state FROZEN_STOP
    Done.

Create the OpenHA service ``busybox``, on both nodes::

    # $EZ_BIN/service -a busybox /etc/opensvc/busybox.opensvc.com.cluster deb1.opensvc.com deb2.opensvc.com /bin/true
    Creating service busybox :
    Directory /usr/local/cluster/services/busybox created
    State file /usr/local/cluster/services/busybox/STATE.deb1.opensvc.com created with initial state FROZEN_STOP
    State file /usr/local/cluster/services/busybox/STATE.deb2.opensvc.com created with initial state FROZEN_STOP
    Done.

* ``-a busybox``: adds an OpenHA service named busybox
* ``/etc/opensvc/busybox.opensvc.com.cluster``: the start/stop script for the service
* ``deb1.opensvc.com``: the primary node
* ``deb2.opensvc.com``: the secondary node
* ``/bin/true``: the check-up script run before service startup; it should be replaced by a stonith command in a production implementation (see the sketch at the end of this tutorial)
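The start/stop script referenced above (``/etc/opensvc/<svcname>.cluster``) is the glue between OpenHA and OpenSVC: OpenHA invokes it with ``start`` or ``stop``, and it forwards the action to the OpenSVC service. The wrapper shipped with your agent version is documented in the OpenHA integration guide; the snippet below is only an illustrative sketch of the idea, not the actual script::

    #!/bin/sh
    # /etc/opensvc/registry.opensvc.com.cluster -- illustrative sketch only
    # (your OpenSVC installation may already provide an equivalent wrapper)
    SVC=registry.opensvc.com

    case "$1" in
        start) $SVC start --cluster ;;
        stop)  $SVC stop --cluster ;;
        *)     echo "usage: $0 {start|stop}" >&2; exit 1 ;;
    esac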
OpenHA Start
^^^^^^^^^^^^

On node deb1.opensvc.com::

    root@deb1:~# $EZ/ezha start
    root@deb1:~# ez_ps
    root 25004 0.0 0.1 16664 924 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartd eth1 239.131.50.10 5010
    root 25007 0.0 0.1 16664 920 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartd eth2 239.131.50.11 5011
    root 25010 0.0 0.0 16452 500 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartc eth1 239.131.50.12 5012 10
    root 25013 0.0 0.0 16452 500 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartc eth2 239.131.50.13 5013 10
    root 25016 0.0 0.1 16452 908 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartd_dio /dev/mapper/disk.hb.0 0
    root 25019 0.0 0.1 16240 756 pts/10 S< 07:46 0:00 /usr/local/cluster/bin/heartc_dio /dev/mapper/disk.hb.0 10 10
    root 25024 0.0 0.2 97928 1052 pts/10 S  07:46 0:00 /usr/local/cluster/bin/nmond

    root@deb1:~# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     UNKNOWN
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     UNKNOWN

.. note::

   FROZEN_STOP is the expected, default OpenHA service status at first service declaration.

::

    root@deb1:~# ez_hb_status
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net sender eth1:239.131.50.10 pid 25004 status UP, updated at Jun 19 08:00:57
    net sender eth2:239.131.50.11 pid 25007 status UP, updated at Jun 19 08:00:57
    net listener eth1:239.131.50.12 status DOWN
    net listener eth2:239.131.50.13 status DOWN
    dio sender /dev/mapper/disk.hb.0:0 pid 25016 status UP, updated at Jun 19 08:00:58
    dio listener /dev/mapper/disk.hb.0:10 status DOWN

.. note::

   Although the deb1 heartbeat senders announce their status correctly, the listener processes do not read anything coming from deb2 yet. This is expected, because OpenHA is not started on deb2 at the moment.

On node deb2.opensvc.com::

    root@deb2:/# $EZ/ezha start
    root@deb2:/# ez_ps
    root 12407 0.0 0.1 16452 736 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartc eth1 239.131.50.10 5010 10
    root 12410 0.0 0.1 16452 740 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartc eth2 239.131.50.11 5011 10
    root 12413 0.0 0.1 16664 920 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartd eth1 239.131.50.12 5012
    root 12416 0.0 0.1 16664 920 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartd eth2 239.131.50.13 5013
    root 12419 0.0 0.1 16240 768 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartc_dio /dev/mapper/disk.hb.0 0 10
    root 12422 0.0 0.1 16452 908 pts/0 S< 08:10 0:00 /usr/local/cluster/bin/heartd_dio /dev/mapper/disk.hb.0 10
    root 12426 0.0 0.2 32392 1040 pts/0 S  08:10 0:00 /usr/local/cluster/bin/nmond

    root@deb2:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP
    registry    deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP

    root@deb2:/# ez_hb_status
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 pid 25004 status UP, updated at Jun 19 08:10:49
    net listener eth2:239.131.50.11 pid 25007 status UP, updated at Jun 19 08:10:49
    net sender eth1:239.131.50.12 pid 12413 status UP, updated at Jun 19 08:10:49
    net sender eth2:239.131.50.13 pid 12416 status UP, updated at Jun 19 08:10:49
    dio listener /dev/mapper/disk.hb.0:0 pid 25016 status UP, updated at Jun 19 08:10:40
    dio sender /dev/mapper/disk.hb.0:10 pid 12422 status UP, updated at Jun 19 08:10:49

.. note::

   All heartbeat processes are now OK, senders and listeners alike. The same can be observed on the other node, deb1.
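So far OpenHA has been started by hand. For a real cluster you will probably want it started at boot time; the package ships an ``ezha.init`` script in ``/usr/local/cluster``. On Debian, enabling it could look like the sketch below (assuming the bundled script is usable as an init script on your OpenHA version; adapt to your init system)::

    # illustrative only -- adapt to your distribution and OpenHA version
    root@deb1:~# cp /usr/local/cluster/ezha.init /etc/init.d/ezha
    root@deb1:~# chmod 755 /etc/init.d/ezha
    root@deb1:~# update-rc.d ezha defaults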
OpenHA Services
^^^^^^^^^^^^^^^

As you have seen, both OpenHA services are in the FROZEN_STOP state, which means they are declared in OpenHA but their state cannot change. They first have to be unfrozen.

Check the OpenSVC state; the services should be stopped on both nodes::

    root@deb1:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False

    root@deb2:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False

Unfreeze the service ``registry`` on node deb1.opensvc.com::

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# $EZ_BIN/service -A registry unfreeze
    Ready to UNFREEZE, we are (FROZEN_STOP)
    Service registry UNFROZEN

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     START_READY   deb2.opensvc.com     FROZEN_STOP
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTING      deb2.opensvc.com     FROZEN_STOP
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     FROZEN_STOP
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP
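Rather than running ``ez_svc_status`` by hand between each transition, you can watch the states change with the same kind of loop used later in the failover test::

    # poll the OpenHA service states every 2 seconds (Ctrl-C to stop)
    root@deb1:/# while true; do ez_svc_status; sleep 2; done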
OpenSVC now reports the service as up on deb1.opensvc.com::

    root@deb1:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     up         up      up      up      n/a     n/a     up      up      up      up       False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False

    root@deb1:/# registry.opensvc.com print status
    registry.opensvc.com
    overall up
    |- avail up
    |  |- container#1 .... up e9828ec4620a@registry:latest
    |  |- vg#1 .... up vgregistry
    |  |- fs#1 .... up /dev/mapper/vgregistry-lvregistryroot@/opt/registry.opensvc.com
    |  |- fs#2 .... up /dev/mapper/vgregistry-lvdockerdata@/opt/registry.opensvc.com/dockerdata
    |  |- fs#3 .... up /dev/mapper/vgregistry-lvregistrydata@/opt/registry.opensvc.com/registrydata
    |  '- ip#1 .... up registry.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb up
       '- hb#0 .... up hb.openha

    root@deb1:/# registry.opensvc.com docker ps
    CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                        NAMES
    e9828ec4620a        registry:latest     /bin/sh -c 'exec doc   47 hours ago        Up 2 minutes        37.59.71.25:5000->5000/tcp   registry.opensvc.com.container.1

.. note::

   After the unfreeze, we observe the state transitions FROZEN_STOP => START_READY => STARTING => STARTED. Once STARTED is reached, the service is up and running.

Unfreeze the service ``registry`` on node deb2.opensvc.com::

    root@deb2:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     FROZEN_STOP

    root@deb2:/# $EZ_BIN/service -A registry unfreeze
    Ready to UNFREEZE, we are (FROZEN_STOP)
    Service registry UNFROZEN

    root@deb2:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED

    root@deb2:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     down       down    down    down    n/a     n/a     down    up      down    down     False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     warn    up      down    warn     False

.. note::

   The OpenHA service ``registry`` is now STARTED@deb1 and STOPPED@deb2.
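At this point the service answers on its own name and IP wherever OpenHA decides to run it. A quick check can be run from the passive node; the check is only illustrative and assumes the classic docker-registry image, which answers on ``/v1/_ping``::

    root@deb2:/# ping -c 1 registry.opensvc.com
    root@deb2:/# curl -s -o /dev/null -w "%{http_code}\n" http://registry.opensvc.com:5000/v1/_ping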
We can enable the second service ``busybox``, using the same procedure.

Unfreeze service ``busybox`` on node deb1.opensvc.com::

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    busybox     deb1.opensvc.com     FROZEN_STOP   deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# $EZ_BIN/service -A busybox unfreeze
    Ready to UNFREEZE, we are (FROZEN_STOP)
    Service busybox UNFROZEN

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    busybox     deb1.opensvc.com     START_READY   deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    busybox     deb1.opensvc.com     STARTING      deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     FROZEN_STOP

    root@deb1:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     up         up      up      up      n/a     n/a     up      up      up      up       False
    busybox.opensvc.com  DEV       hosted     up         up      up      up      n/a     n/a     up      up      up      up       False

    root@deb1:/# busybox.opensvc.com print status
    busybox.opensvc.com
    overall up
    |- avail up
    |  |- container#1 .... up c37b83887947@opensvc/busybox:date
    |  |- container#2 .... up a189884f060a@opensvc/busybox:date
    |  |- container#3 .... up fb2513423499@opensvc/busybox:date
    |  |- container#4 .... up 3a7e545bced2@opensvc/busybox:date
    |  |- container#5 .... up 1b6a3d00661e@opensvc/busybox:date
    |  |- container#6 .... up 173a812f4ed5@opensvc/busybox:date
    |  |- container#7 .... up 59fca3cd2745@opensvc/busybox:date
    |  |- vg#1 .... up vgbusybox
    |  |- fs#1 .... up /dev/mapper/vgbusybox-lvbusyboxroot@/opt/busybox.opensvc.com
    |  |- fs#2 .... up /dev/mapper/vgbusybox-lvbusyboxdata@/opt/busybox.opensvc.com/appdata
    |  '- ip#1 .... up busybox.opensvc.com@eth0
    |- sync up
    |  '- sync#i0 .... up rsync svc config to drpnodes, nodes
    '- hb up
       '- hb#0 .... up hb.openha
    root@deb1:/# busybox.opensvc.com docker attach 59fca3cd2745
    Thu Jun 19 07:07:27 UTC 2014
    Thu Jun 19 07:07:28 UTC 2014
    Thu Jun 19 07:07:29 UTC 2014
    Thu Jun 19 07:07:30 UTC 2014

    root@deb1:/# busybox.opensvc.com docker ps
    CONTAINER ID        IMAGE                  COMMAND                CREATED             STATUS               PORTS               NAMES
    59fca3cd2745        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.7
    173a812f4ed5        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.6
    fb2513423499        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.3
    3a7e545bced2        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.4
    a189884f060a        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.2
    1b6a3d00661e        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.5
    c37b83887947        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up About a minute                        busybox.opensvc.com.container.1

Unfreeze service ``busybox`` on node deb2.opensvc.com::

    root@deb2:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     FROZEN_STOP
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED

    root@deb2:/# $EZ_BIN/service -A busybox unfreeze
    Ready to UNFREEZE, we are (FROZEN_STOP)
    Service busybox UNFROZEN

    root@deb2:/# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED

Cluster Failover
----------------

Two kinds of events can trigger a cluster failover:

* all heartbeat listeners stop receiving peer notifications for the timeout period (10 seconds in our example)
* one monitored resource is not in the "up" status (a minimal illustration follows this list)

  * this case is not described here, but in `High Availability Resource Monitoring `_
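As an illustration of the second trigger, a resource can be flagged as monitored in the service configuration file, so that its failure is escalated to the HA stack. The snippet below is only a sketch, not something configured in this tutorial; check the Resource Monitoring guide linked above for the exact syntax supported by your agent version::

    # illustrative only -- assumes the "monitor" resource keyword
    [container#1]
    type = docker
    run_image = de9a91835513
    monitor = true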
We will halt node deb1.opensvc.com and observe what happens from deb2.opensvc.com:

* the primary node halt is launched around Jun 19 09:31:57
* the secondary node runs an infinite loop ``while true; do echo "-----------------------------"; ez_svc_status; ez_hb_status; date; sleep 5; done``, so as to follow in detail what happens

On node deb2.opensvc.com::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 pid 2861 status UP, updated at Jun 19 09:31:55
    net listener eth2:239.131.50.11 pid 2864 status UP, updated at Jun 19 09:31:55
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:31:55
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:31:55
    dio listener /dev/mapper/disk.hb.0:0 pid 2873 status UP, updated at Jun 19 09:31:48
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:31:55
    Thu Jun 19 09:31:56 CEST 2014

.. note::

   The halt is launched just after this sample.

::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 pid 2861 status UP, updated at Jun 19 09:31:57
    net listener eth2:239.131.50.11 pid 2864 status UP, updated at Jun 19 09:31:57
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:00
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:00
    dio listener /dev/mapper/disk.hb.0:0 pid 2873 status UP, updated at Jun 19 09:31:58
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:00
    Thu Jun 19 09:32:01 CEST 2014

.. note::

   We can observe that the listener update times are now stuck at 09:31:57. This means the other node is halted and no longer sends heartbeats.

::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    registry    deb1.opensvc.com     STARTED       deb2.opensvc.com     STOPPED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 pid 2861 status UP, updated at Jun 19 09:31:57
    net listener eth2:239.131.50.11 pid 2864 status UP, updated at Jun 19 09:31:57
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:05
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:05
    dio listener /dev/mapper/disk.hb.0:0 pid 2873 status UP, updated at Jun 19 09:31:58
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:05
    Thu Jun 19 09:32:06 CEST 2014

.. note::

   Node deb2.opensvc.com is still waiting for the timeout to expire, which should occur around 09:32:07.

::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     START_READY
    registry    deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     START_READY
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 status DOWN
    net listener eth2:239.131.50.11 status DOWN
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:10
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:10
    dio listener /dev/mapper/disk.hb.0:0 status DOWN
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:10
    Thu Jun 19 09:32:11 CEST 2014

.. note::

   More than 10 seconds have elapsed since deb1.opensvc.com was halted. Node deb2.opensvc.com tags its partner as UNKNOWN for all services, and starts the service takeover. When the cluster is configured with stonith features, this is the moment when the other node is killed, so as to avoid a split brain cluster.
::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTING
    registry    deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 status DOWN
    net listener eth2:239.131.50.11 status DOWN
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:15
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:15
    dio listener /dev/mapper/disk.hb.0:0 status DOWN
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:15
    Thu Jun 19 09:32:16 CEST 2014

.. note::

   We can see in the "secstate" column that the service startups are run in parallel.

::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTING
    registry    deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 status DOWN
    net listener eth2:239.131.50.11 status DOWN
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:20
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:20
    dio listener /dev/mapper/disk.hb.0:0 status DOWN
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:20
    Thu Jun 19 09:32:21 CEST 2014

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTING
    registry    deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 status DOWN
    net listener eth2:239.131.50.11 status DOWN
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:25
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:25
    dio listener /dev/mapper/disk.hb.0:0 status DOWN
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:25
    Thu Jun 19 09:32:26 CEST 2014
.. note::

   Service busybox.opensvc.com takes longer to start than service registry.opensvc.com.

::

    -----------------------------
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    busybox     deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTED
    registry    deb1.opensvc.com     UNKNOWN       deb2.opensvc.com     STARTED
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net listener eth1:239.131.50.10 status DOWN
    net listener eth2:239.131.50.11 status DOWN
    net sender eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 09:32:30
    net sender eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 09:32:30
    dio listener /dev/mapper/disk.hb.0:0 status DOWN
    dio sender /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 09:32:30
    Thu Jun 19 09:32:31 CEST 2014

    root@deb2:/# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     up         up      up      up      n/a     n/a     up      up      up      up       False
    busybox.opensvc.com  DEV       hosted     up         up      up      up      n/a     n/a     up      up      up      up       False

    root@deb2:/# busybox.opensvc.com docker ps
    CONTAINER ID        IMAGE                  COMMAND                CREATED             STATUS              PORTS               NAMES
    59fca3cd2745        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.7
    173a812f4ed5        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.6
    fb2513423499        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.3
    3a7e545bced2        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.4
    a189884f060a        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.2
    1b6a3d00661e        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 17 seconds                           busybox.opensvc.com.container.5
    c37b83887947        opensvc/busybox:date   /bin/sh -c 'while tr   47 hours ago        Up 18 seconds                           busybox.opensvc.com.container.1

    root@deb2:/# date
    Thu Jun 19 09:32:52 CEST 2014

.. note::

   Both services are now up and running on node deb2.opensvc.com.

From the halt on the primary node to service availability on the secondary node, we observe a **1 minute delay**, mostly due to application startup times.

**This setup provides a robust and simple solution for environments that require high availability. Many clusters are currently running in production with OpenSVC & OpenHA.**
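From a client point of view the failover is transparent: the service kept its name and IP, only the hosting node changed. The same illustrative check used earlier should now be answered by deb2.opensvc.com (still assuming the classic docker-registry image and its ``/v1/_ping`` endpoint)::

    # run from any client host, exactly as before the failover
    $ curl -s -o /dev/null -w "%{http_code}\n" http://registry.opensvc.com:5000/v1/_ping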
Cluster Failback
----------------

When node deb1.opensvc.com comes back, it joins the cluster and discovers that the secondary node deb2.opensvc.com owns the services::

    root@deb1:~# ez_svc_status
    => Querying OpenHA services status [/usr/local/cluster/bin/service -s -c]
    service     prinode              pristate      secnode              secstate
    registry    deb1.opensvc.com     STOPPED       deb2.opensvc.com     STARTED
    busybox     deb1.opensvc.com     STOPPED       deb2.opensvc.com     STARTED

    root@deb1:~# ez_hb_status
    => Querying OpenHA heartbeat status [/usr/local/cluster/bin/hb -s]
    net sender eth1:239.131.50.10 pid 2914 status UP, updated at Jun 19 10:03:11
    net sender eth2:239.131.50.11 pid 2941 status UP, updated at Jun 19 10:03:11
    net listener eth1:239.131.50.12 pid 17469 status UP, updated at Jun 19 10:03:11
    net listener eth2:239.131.50.13 pid 17472 status UP, updated at Jun 19 10:03:11
    dio sender /dev/mapper/disk.hb.0:0 pid 3015 status UP, updated at Jun 19 10:03:11
    dio listener /dev/mapper/disk.hb.0:10 pid 17478 status UP, updated at Jun 19 10:03:09

    root@deb1:~# svcmon --service=registry.opensvc.com,busybox.opensvc.com
    service              service   container  container  ip      disk    fs      share   app     hb      sync    avail   overall
    name                 type      type       status     status  status  status  status  status  status  status  status  status   frozen
    -------              -------   ---------  ---------  ------  ------  ------  ------  ------  ------  ------  ------  -------  ------
    registry.opensvc.com DEV       hosted     down       down    down    down    n/a     n/a     down    up      down    down     False
    busybox.opensvc.com  DEV       hosted     down       down    down    down    n/a     n/a     down    up      down    down     False

Production Use
--------------

If you plan to use this setup in production, it is strongly recommended to:

* enable scsi reservation on disks (add ``scsi_reserve = true`` to the volume group resources of each service). This prevents the data corruption that would occur if both nodes mounted the filesystems at the same time.
* implement STONITH as described in `High Availability Stonith `_. As soon as a secondary node decides to take over a service, it stoniths the other node to make sure it is the only node running the OpenSVC service (an illustrative sketch follows this list).
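The check-up script passed to ``service -a`` (``/bin/true`` in this tutorial) is the natural place to plug such a STONITH action. The snippet below is only an illustrative sketch: it assumes the peer node exposes an IPMI BMC reachable as ``deb1-ipmi`` with hypothetical credentials, while the methods actually supported are described in the High Availability Stonith guide::

    #!/bin/sh
    # illustrative stonith hook for deb2: power the peer off before taking over
    # (hypothetical BMC address and credentials -- replace with your own)
    exec ipmitool -I lanplus -H deb1-ipmi -U admin -P secret chassis power off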