
Adding a node to the cluster:
gnt-node add <name>
# gnt-group add group2
# gnt-group rename default group1
# gnt-group assign-nodes group2 node20 node21 node22 ...
# gnt-instance change-group --to group1 instance_name
# on a master candidate
gnt-cluster master-failover # use --no-voting on a 2-node cluster
(A linux-HA experimental integration is present in 2.7)
We can remove instances from a node when we want to perform some maintenance.
Drain, move instances, check, set off-line:
gnt-node modify -D yes node2 # mark as "drained"
gnt-node migrate node2 # migrate instances
gnt-node evacuate node2 # remove DRBD secondaries
gnt-node info node2 # check your work
gnt-node modify -O yes node2 # mark as "offline"
It is now safe to power off node2
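The maintenance sequence above could be wrapped in a small shell function. This is only a sketch: the names `run`, `drain_node` and `DRY_RUN` are assumptions made for illustration and are not part of Ganeti. With `DRY_RUN=1` (the default here) the function only prints the commands, in the same order as listed above, and it stops at the first step that fails.

```shell
#!/bin/sh
# Sketch only: run, drain_node and DRY_RUN are hypothetical helper names,
# not part of Ganeti. The gnt-node commands are the ones listed above.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take one node out of service, stopping at the first failing step.
drain_node() {
    node="$1"
    [ -n "$node" ] || { echo "usage: drain_node <node>" >&2; return 1; }
    run gnt-node modify -D yes "$node" &&   # mark as "drained"
    run gnt-node migrate "$node" &&         # migrate instances
    run gnt-node evacuate "$node" &&        # remove DRBD secondaries
    run gnt-node info "$node" &&            # check your work
    run gnt-node modify -O yes "$node"      # mark as "offline"
}
```

Calling `drain_node node2` prints the five commands for review; `DRY_RUN=0 drain_node node2` would actually run them on the master node.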
When a node fails unexpectedly, set it offline:
gnt-node modify -O yes node3
If the node was a master candidate, add --auto-promote to the command above,
or manually promote another node.
(This step can also be automated using linux-HA)
Fail over the instances that had node3 as their primary node:
gnt-node failover --ignore-consistency node3
or, for each instance:
gnt-instance failover --ignore-consistency web
Restore redundancy by moving the DRBD secondaries off node3:
gnt-node evacuate -I hail node3
or, for each instance:
gnt-instance replace-disks {-n node1 | -I hail } web
(The autorepair tool in Ganeti 2.7 can automate these two steps)
After a node comes back:
gnt-node add --readd node3
Then it's a good idea to rebalance the cluster:
hbal -L -X
Stopping/starting all instances:
gnt-instance stop|start --all [--no-remember]
Blocking/Unblocking jobs:
gnt-cluster queue [un]drain
Stopping the watcher:
gnt-cluster watcher pause <timespec>|continue
Graceful shutdown before powering off nodes:
gnt-cluster verify
gnt-cluster watcher pause 6000
gnt-instance stop --all --no-remember
gnt-cluster queue drain
gnt-job list --running # Check if jobs have completed
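The last step of the graceful shutdown (checking that jobs have completed) could be automated with a polling loop. The following is a hedged sketch, not a Ganeti tool: `list_running_jobs`, `wait_for_jobs` and `POLL_SECONDS` are hypothetical names, and it assumes that `gnt-job list` accepts `--no-headers` together with `--running`, so that an empty output means no job is running.

```shell
#!/bin/sh
# Sketch only: list_running_jobs, wait_for_jobs and POLL_SECONDS are
# hypothetical names, not part of Ganeti.

# List running jobs, one per line, without a header row.
# Assumption: gnt-job list supports --no-headers alongside --running.
list_running_jobs() {
    gnt-job list --running --no-headers
}

# Poll until the list of running jobs is empty.
# Returns non-zero if the job list itself cannot be obtained.
wait_for_jobs() {
    while :; do
        jobs_out="$(list_running_jobs)" || return 1
        [ -z "$jobs_out" ] && break
        echo "jobs still running, waiting..." >&2
        sleep "${POLL_SECONDS:-10}"
    done
    echo "job queue is idle"
}
```

Run `wait_for_jobs` after `gnt-cluster queue drain`; once it reports that the queue is idle, the nodes can be powered off.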
Emergency shutdown (faster):
gnt-instance stop --all --no-remember
After a graceful shutdown, return the cluster to service:
gnt-cluster queue undrain
gnt-cluster watcher continue
The watcher will restart all instances within 10-20 minutes. Afterwards, check the cluster:
gnt-cluster verify
Upgrading Ganeti, from the master node:
alias gnt-dsh='dsh -cf /var/lib/ganeti/ssconf_online_nodes'
Stop Ganeti:
gnt-dsh /etc/init.d/ganeti stop
Now unpack/install the new version on all nodes, e.g.:
gnt-dsh apt-get install ganeti2=2.7.1-1 ganeti-htools=2.7.1-1
Now upgrade the config and restart:
/usr/lib/ganeti/tools/cfgupgrade
gnt-dsh /etc/init.d/ganeti start
gnt-cluster redist-conf
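The upgrade steps above could be collected into one script. This is a sketch under assumptions: `run`, `gnt_dsh`, `upgrade_ganeti`, `NODES_FILE` and `DRY_RUN` are hypothetical names, and a shell function stands in for the gnt-dsh alias because aliases are not expanded in non-interactive shells by default. The package names and paths are the ones from the example above. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: run, gnt_dsh, upgrade_ganeti, NODES_FILE and DRY_RUN are
# hypothetical names, not part of Ganeti.
NODES_FILE=/var/lib/ganeti/ssconf_online_nodes

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Function equivalent of the gnt-dsh alias: run a command on all online nodes.
gnt_dsh() {
    run dsh -cf "$NODES_FILE" "$@"
}

# Stop Ganeti, install the given package version, upgrade the config, restart.
upgrade_ganeti() {
    version="$1"
    [ -n "$version" ] || { echo "usage: upgrade_ganeti <package-version>" >&2; return 1; }
    gnt_dsh /etc/init.d/ganeti stop &&
    gnt_dsh apt-get install "ganeti2=$version" "ganeti-htools=$version" &&
    run /usr/lib/ganeti/tools/cfgupgrade &&
    gnt_dsh /etc/init.d/ganeti start &&
    run gnt-cluster redist-conf
}
```

For example, `upgrade_ganeti 2.7.1-1` prints the five commands for review before anything is changed on the cluster.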
Regarding upgrades, we are currently (as of 2.10) working on upgrading Ganeti from inside Ganeti, to make upgrading smoother.
For more best practices, see the Ganeti administrator's guide