
Where to put an instance? - let the cluster figure it out!
gnt-instance add [--iallocator hail] myinstance.example.com
hail
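In practice the add command needs more than the allocator; a minimal sketch, where disk template, disk size, memory and OS variant are placeholders for whatever your cluster provides:

# Sketch only: sizes and OS are illustrative; hail does the placement.
gnt-instance add -t drbd --disk 0:size=10G -B memory=1G \
  -o debootstrap+default --iallocator hail myinstance.example.com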
Read cluster configuration, calculate, and balance:
hbal -L -X
Read cluster configuration, calculate, don't execute:
hbal -L
Minimal moves to evacuate any "drained" nodes:
hbal -L --evac-mode -X
Migrate only. Don't move any disks:
hbal -L --no-disk-moves
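A common pattern is to inspect the dry run before applying it; a hypothetical wrapper (not part of htools):

#!/bin/sh
# Preview the planned moves, then execute only after confirmation.
# The second run recomputes the solution, so the cluster should not
# change in between.
hbal -L || exit $?
printf 'Execute these moves? [y/N] '
read answer
[ "$answer" = y ] && hbal -L -X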
Capacity planning
hspace simulates sequentially adding new instances until the cluster is full.
Use the Luxi backend (-L) to get live cluster data:
# hspace -L
The cluster has 3 nodes and the following resources:
  MEM 196569, DSK 10215744, CPU 72, VCPU 288.
There are 2 initial instances on the cluster.
Tiered (initial size) instance spec is:
  MEM 1024, DSK 1048576, CPU 8, using disk template 'drbd'.
Tiered allocation results:
  - 4 instances of spec MEM 1024, DSK 1048576, CPU 8
  - 2 instances of spec MEM 1024, DSK 258304, CPU 8
  - most likely failure reason: FailDisk
  - initial cluster score: 1.92199260
  - final cluster score: 2.03107472
  - memory usage efficiency: 3.26%
  - disk usage efficiency: 92.27%
  - vcpu usage efficiency: 18.40%
[...]
The simulation backend: one of the lesser-known backends (available in hspace and hail)
Mainly for cluster planning
What if I bought 10 times more disks?
$ hspace --simulate=p,3,34052480,65523,24 \
>   --disk-template=drbd --tiered-alloc=1048576,1024,8
The cluster has 3 nodes and the following resources:
  MEM 196569, DSK 102157440, CPU 72, VCPU 288.
There are no initial instances on the cluster.
Tiered (initial size) instance spec is:
  MEM 1024, DSK 1048576, CPU 8, using disk template 'drbd'.
Tiered allocation results:
  - 33 instances of spec MEM 1024, DSK 1048576, CPU 8
  - 3 instances of spec MEM 1024, DSK 1048576, CPU 7
  - most likely failure reason: FailCPU
  - initial cluster score: 0.00000000
  - final cluster score: 0.00000000
  - memory usage efficiency: 18.75%
  - disk usage efficiency: 73.90%
  - vcpu usage efficiency: 100.00%
[...]
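The --simulate value describes the hypothetical cluster: allocation policy, node count, then per-node disk, memory and CPUs; --tiered-alloc gives the starting instance spec as disk, memory and VCPUs. Annotated with the values used above:

# --simulate = <alloc-policy>,<nodes>,<disk/node>,<mem/node>,<cpus/node>
#   p          allocation policy 'preferred'
#   3          three nodes in the simulated node group
#   34052480   disk per node in MiB (10x the original 3405248 MiB/node)
#   65523      memory per node in MiB
#   24         physical CPUs per node
# --tiered-alloc = <disk>,<mem>,<vcpus>: the starting instance spec
hspace --simulate=p,3,34052480,65523,24 \
    --disk-template=drbd --tiered-alloc=1048576,1024,8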
When rebooting all nodes (e.g., kernel update), there are several things to take care of.
hroller suggests groups of nodes to be rebooted together.
By default, hroller plans for live migration:
# hroller -L
'Node Reboot Groups'
node-00,node-10,node-20,node-30
node-01,node-11,node-21,node-31
Also possible to only avoid simultaneous primary/secondary reboots (--offline-maintenance) or to plan complete node evacuation (--full-evacuation).
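For example, if all instances will be shut down during the maintenance anyway:

# Sketch: groups only need to avoid simultaneous primary/secondary reboots.
hroller -L --offline-maintenance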
# hroller -L --full-evacuation
'Node Reboot Groups'
node-01,node-11
node-00,node-10
node-20,node-30
node-21,node-31
For the full evacuation, the moves can also be shown (--print-moves), typically together with --one-step-only.
# hroller -L --full-evacuation --print-moves --one-step-only
'First Reboot Group'
node-01
node-11
  inst-00 node-00 node-20
  inst-01 node-00 node-10
  inst-10 node-10 node-21
  inst-11 node-10 node-00
Nodes to be considered can also be selected by tags. This allows reboots to be interleaved with other operations.
GROUP=`hroller --node-tags needsreboot --one-step-only --no-headers -L`
for node in $GROUP; do gnt-node modify -D yes $node; done
for node in $GROUP; do gnt-node migrate -f --submit $node; done
# ... wait for migrate jobs to finish
# reboot nodes in $GROUP
# verify...
for node in $GROUP; do gnt-node remove-tags $node needsreboot; done
for node in $GROUP; do gnt-node modify -D no $node; done
hbal -L -X
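The "wait for migrate jobs" step above can be scripted by capturing the job IDs that --submit prints; a sketch, assuming one job ID per line of output:

# Collect the job IDs from 'gnt-node migrate --submit' ...
JOBS=$(for node in $GROUP; do gnt-node migrate -f --submit $node; done)
# ... then block until each job has finished.
for job in $JOBS; do gnt-job watch $job; done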