SlapOS Comp1 HA¶
Purpose¶
Nexedi SlapOS is a distributed, service-oriented operating system. It is composed of two kinds of nodes: master and slave. SlapOS can tolerate the loss of the master node for a few hours, but the loss of a slave (worker) node causes a service outage.
This survival guide describes how to deploy and manage a highly available SlapOS slave node, using an OpenSVC cluster and Linbit DRBD data replication.
The software stack installation is automated with an Ansible playbook, which configures the OpenSVC cluster, deploys the Re6st component (an IPv6 mesh network required by SlapOS), and then deploys SlapOS.
The SlapOS comp1 component is embedded in an OpenSVC service, allowing the SlapOS administrator to move the service from one cluster node to another, and to survive a server failure (crash, power off, …).
Prerequisites¶
Cluster¶
2 nodes running Ubuntu 22.04 LTS
disk capacity sized for the operating system plus SlapOS needs
Re6st token (https://handbook.rapid.space/rapidspace-HowTo.Request.A.Freefib.Token)
SlapOS token (https://slapos.nexedi.com/slapos-Tutorial.Install.Slapos.Node.Comp.123)
Note
For clarity, the example logs below show cluster nodes with hostnames demo1 and demo2.
Ansible control node¶
Any OS supporting Ansible
Ansible >= 2.13
Note
For clarity, the example logs below show an Ansible control node with hostname ansible.
Installing¶
Download example playbook¶
On the Ansible control node:
user@ansible $ wget https://raw.githubusercontent.com/opensvc/ansible-collection-osvc/master/examples/slapos/playbook-slap-comp-1.yml
Warning
Change the playbook tunables to suit your needs (tokens, hostnames, sizes, …). Check the Ansible role documentation.
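To review the tunables before editing, a simple grep can help; a sketch assuming the example playbook keeps them in a vars: section, as the collection examples do:
user@ansible $ grep -n -A 20 'vars:' playbook-slap-comp-1.yml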
Install Ansible prerequisites¶
The opensvc.app collection has to be installed on the control node.
On the Ansible control node:
user@ansible $ sudo ansible-galaxy collection install opensvc.app
Process install dependency map
Starting collection install process
Installing 'opensvc.app:1.0.0' to '/root/.ansible/collections/ansible_collections/opensvc/app'
Installing 'ansible.posix:1.5.4' to '/root/.ansible/collections/ansible_collections/ansible/posix'
Installing 'opensvc.cluster:1.2.1' to '/root/.ansible/collections/ansible_collections/opensvc/cluster'
Note
You may check the Ansible collection installation documentation and/or the OpenSVC side container alternative.
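To confirm the collections are visible to Ansible, list them; a minimal sketch filtering on the names installed above:
user@ansible $ ansible-galaxy collection list | grep -E 'opensvc|ansible\.posix'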
Prepare your Ansible inventory file to match the target cluster nodes (see example below)¶
On the Ansible control node:
[clusternodes]
demo1.acme.com ansible_host="5.200.201.202" ansible_ssh_private_key_file="ssh.private.key" ansible_user=ubuntu ansible_become=true
demo2.acme.com ansible_host="5.200.201.203" ansible_ssh_private_key_file="ssh.private.key" ansible_user=ubuntu ansible_become=true
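Before running the playbook, you can check that both nodes are reachable with the declared credentials, using the standard Ansible ping module:
user@ansible $ ansible -i inventory clusternodes -m ping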
Running playbook¶
On the Ansible control node:
user@ansible $ sudo ansible-playbook -i inventory playbook-slap-comp-1.yml
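Optionally, validate the playbook first; --syntax-check is a standard ansible-playbook flag and does not modify the nodes:
user@ansible $ ansible-playbook -i inventory playbook-slap-comp-1.yml --syntax-check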
The control node executes the main playbook, which performs the following high-level tasks:
assemble the two nodes into an OpenSVC cluster
configure DRBD on both nodes (see the verification sketch after this list)
create the OpenSVC service for the SlapOS comp1 feature
execute the re6st playbook on the first node
move the OpenSVC service to the second node
execute the re6st playbook on the second node
move the service back to the first node
execute the SlapOS playbook on the first node
move the OpenSVC service to the second node
execute the SlapOS playbook on the second node
move the service back to the first node
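Once the playbook has completed, DRBD replication health can be verified on either cluster node. A minimal sketch, assuming the DRBD resource is named comp1.slapos.svc.hyperopenx as in the service status shown later in this guide:
user@demo1 $ sudo drbdadm status comp1.slapos.svc.hyperopenx
Both the local and peer disks should report UpToDate, with one node Primary and the other Secondary.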
Note
A tarball containing the playbook execution logs can be downloaded here.
At the end of playbook execution, you should have an operational service:
Checking Status¶
Cluster¶
Cluster status can be checked with the om mon command:
Threads                         demo1        demo2
 daemon    running            |
 hb#1.rx   running [::]:10000 | /            O
 hb#1.tx   running            | /            O
 listener  running :1214
 monitor   running
 scheduler running

Nodes                           demo1        demo2
 score                        | 69           70
 load 15m                     | 0.0          0.0
 mem                          | 15/98%:3.82g 9/98%:3.82g
 swap                         | -            -
 state                        |

*/svc/*                         demo1        demo2
 slapos/svc/comp1 up ha 1/1   | O^           S
Service¶
Service status can be checked with the om slapos/svc/comp1 print status command:
slapos/svc/comp1                  up
`- instances
   |- demo2                       stdby up   idle
   `- demo1                       up         idle, started
      |- volume#0        ........ up         comp1-cfg
      |- disk#0          ......S. stdby up   loop /opt/comp1.slapos.svc.hyperopenx.img
      |- disk#1          ......S. stdby up   vg comp1.slapos.svc.hyperopenx
      |- disk#2          ......S. stdby up   lv comp1.slapos.svc.hyperopenx/comp1
      |- disk#3          ......S. stdby up   drbd comp1.slapos.svc.hyperopenx
      |                                      info: Primary
      |- fs#0            ........ up         ext4 /dev/drbd0@/srv/comp1.slapos.svc.hyperopenx
      |- fs#flag         ........ up         fs.flag
      |- fs:binds
      |  |- fs#1         ........ up         bind /srv/comp1.slapos.svc.hyperopenx/re6st/etc/re6stnet@/etc/re6stnet
      |  |- fs#2         ........ up         bind /srv/comp1.slapos.svc.hyperopenx/re6st/var/log/re6stnet@/var/log/re6stnet
      |  |- fs#3         ........ up         bind /srv/comp1.slapos.svc.hyperopenx/re6st/var/lib/re6stnet@/var/lib/re6stnet
      |  |- fs#4         ........ up         bind /srv/comp1.slapos.svc.hyperopenx/slapos/srv/slapgrid@/srv/slapgrid
      |  `- fs#5         ........ up         bind /srv/comp1.slapos.svc.hyperopenx/slapos/etc/opt@/etc/opt
      |- app:re6st
      |  `- app#0        ...../.. up         forking: re6st
      |- app:slapos
      |  `- app#1        ...../.. up         forking: slapos
      |- sync#i0         ...O./.. up         rsync svc config to nodes
      `- task:admin
         |- task#addpart  ...O./.. up        task.host
         |- task#chkaddip ...O./.. up        task.host
         |- task#collect  ...O./.. up        task.host
         |- task#delpart  ...O./.. up        task.host
         `- task#software ...O./.. up        task.host
Note
Add the -r option to force immediate resource status evaluation: om slapos/svc/comp1 print status -r
Tasks¶
The SlapOS component needs cron jobs to be executed; these have been integrated as OpenSVC tasks.
The task schedules can be displayed with om slapos/svc/comp1 print schedule:
Action             Last Run             Next Run             Config Parameter          Schedule Definition
|- compliance_auto -                    2023-11-10 03:48:52  DEFAULT.comp_schedule     ~00:00-06:00
|- push_resinfo    -                    2023-11-09 14:34:16  DEFAULT.resinfo_schedule  @60
|- status          2023-11-09 14:25:36  2023-11-09 14:35:36  DEFAULT.status_schedule   @10
|- run             2023-11-09 14:34:10  2023-11-09 14:35:10  task#addpart.schedule     @1m
|- run             2023-11-09 14:28:10  2023-11-09 15:28:10  task#chkaddip.schedule    @60m
|- run             2023-11-09 14:34:10  2023-11-09 14:35:10  task#collect.schedule     @1m
|- run             2023-11-09 14:28:10  2023-11-09 15:28:10  task#delpart.schedule     @60m
|- run             2023-11-09 14:34:10  2023-11-09 14:35:10  task#software.schedule    @1m
`- sync_all        2023-11-09 14:05:58  2023-11-09 15:05:58  sync#i0.schedule          @60
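A task can also be triggered on demand, outside of its schedule, using the OpenSVC run action confined to a single resource; a minimal sketch with the addpart task:
om slapos/svc/comp1 run --rid task#addpart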
Management commands¶
Starting service¶
om slapos/svc/comp1 start
Relocating service¶
om slapos/svc/comp1 switch
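To hand the service back to its preferred node after maintenance, the giveback action can be used instead of a second switch; a minimal sketch:
om slapos/svc/comp1 giveback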
Stopping service¶
om slapos/svc/comp1 stop
Fetching service config¶
om slapos/svc/comp1 print config
Editing service config¶
om slapos/svc/comp1 edit config
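Individual keywords can also be read without opening an editor, using the get action; a minimal sketch querying the node list of the service:
om slapos/svc/comp1 get --kw DEFAULT.nodes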
Notes¶
This deployment is still a work in progress and needs to be reworked:
add more storage options
check the IPv6 routes prerequisite for the SlapOS installer
container implementation (LXC? Docker?)
configure the API for external management
add more heartbeats
…