Please find the latest version of these slides at:
Where to put an instance? - let the cluster figure it out!
gnt-instance add [--iallocator hail] myinstance.example.com
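In full, the command also takes the usual instance parameters; a minimal sketch, where disk template, disk size, and OS name are placeholders for site-specific values:
gnt-instance add -t drbd --disk 0:size=10G -o debootstrap+default \
  --iallocator hail myinstance.example.com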
hail handles placement of new instances; hbal rebalances the existing cluster.
Read cluster configuration, calculate, and balance:
hbal -L -X
Read cluster configuration, calculate, don't execute:
hbal -L
Minimal moves to evacuate any "drained" nodes:
hbal -L --evac-mode -X
Migrate only. Don't move any disks:
hbal -L --no-disk-moves
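A typical workflow is a dry run first, then execution. On a cluster with several node groups, hbal processes one group per run; a sketch, assuming the standard group name 'default':
# inspect the proposed moves and the score improvement
hbal -L -G default
# then apply them
hbal -L -X -G default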
Capacity planning
Simulate sequentially adding new instances until no more fit.
Use the Luxi backend to get live cluster data:
# hspace -L
The cluster has 3 nodes and the following resources:
MEM 196569, DSK 10215744, CPU 72, VCPU 288.
There are 2 initial instances on the cluster.
Tiered (initial size) instance spec is:
MEM 1024, DSK 1048576, CPU 8, using disk template 'drbd'.
Tiered allocation results:
- 4 instances of spec MEM 1024, DSK 1048576, CPU 8
- 2 instances of spec MEM 1024, DSK 258304, CPU 8
- most likely failure reason: FailDisk
- initial cluster score: 1.92199260
- final cluster score: 2.03107472
- memory usage efficiency: 3.26%
- disk usage efficiency: 92.27%
- vcpu usage efficiency: 18.40%
[...]
The simulation backend is one of the lesser-known backends (hspace and hail).
It is mainly used for cluster planning.
What if I bought 10 times more disks?
$ hspace --simulate=p,3,34052480,65523,24 \
> --disk-template=drbd --tiered-alloc=1048576,1024,8
The cluster has 3 nodes and the following resources:
MEM 196569, DSK 102157440, CPU 72, VCPU 288.
There are no initial instances on the cluster.
Tiered (initial size) instance spec is:
MEM 1024, DSK 1048576, CPU 8, using disk template 'drbd'.
Tiered allocation results:
- 33 instances of spec MEM 1024, DSK 1048576, CPU 8
- 3 instances of spec MEM 1024, DSK 1048576, CPU 7
- most likely failure reason: FailCPU
- initial cluster score: 0.00000000
- final cluster score: 0.00000000
- memory usage efficiency: 18.75%
- disk usage efficiency: 73.90%
- vcpu usage efficiency: 100.00%
[...]
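For reference, the --simulate argument describes the hypothetical node group as policy,nodes,disk,memory,cpus (sizes in MiB), and --tiered-alloc gives the starting instance spec as disk,memory,cpus; the numbers above decode as:
# --simulate=p,3,34052480,65523,24
#   p         allocation policy of the simulated node group
#   3         number of nodes
#   34052480  disk per node (MiB, about 32.5 TiB)  -> DSK 102157440 in total
#   65523     memory per node (MiB, about 64 GiB)  -> MEM 196569 in total
#   24        CPUs per node                        -> CPU 72 in total
# --tiered-alloc=1048576,1024,8
#   starting instance spec: 1 TiB disk, 1 GiB memory, 8 VCPUs
With ten times the disk, allocation is no longer limited by disk (FailDisk above) but by CPU (FailCPU).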
When rebooting all nodes (e.g., kernel update), there are several things to take care of.
hroller suggests groups of nodes to be rebooted together.
By default, it plans for live migration of the instances.
# hroller -L
'Node Reboot Groups'
node-00,node-10,node-20,node-30
node-01,node-11,node-21,node-31
It is also possible to only avoid rebooting a primary together with its secondary (--offline-maintenance),
or to plan complete node evacuation (--full-evacuation).
# hroller -L --full-evacuation
'Node Reboot Groups'
node-01,node-11
node-00,node-10
node-20,node-30
node-21,node-31
For full evacuation, the planned moves can also be shown (--print-moves), typically together with --one-step-only.
# hroller -L --full-evacuation --print-moves --one-step-only
'First Reboot Group'
node-01
node-11
inst-00 node-00 node-20
inst-00 node-00 node-10
inst-10 node-10 node-21
inst-11 node-10 node-00
Nodes to be considered can also be selected by tags. This allows reboots to be interleaved with other operations.
GROUP=`hroller --node-tags needsreboot --one-step-only --no-headers -L`
for node in $GROUP; do gnt-node modify -D yes $node; done
for node in $GROUP; do gnt-node migrate -f --submit $node; done
# ... wait for migrate jobs to finish
# reboot nodes in $GROUP
# verify...
for node in $GROUP; do gnt-node remove-tags $node needsreboot; done
for node in $GROUP; do gnt-node modify -D no $node; done
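# finally, rebalance the cluster after the rolling reboot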
hbal -L -X
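The "wait for migrate jobs to finish" step above can be scripted as well; a sketch, assuming gnt-node migrate --submit prints the job ID of the submitted job:
JOBS=""
for node in $GROUP; do
  # --submit returns immediately; collect the printed job IDs
  JOBS="$JOBS $(gnt-node migrate -f --submit $node)"
done
# gnt-job watch follows each job until it terminates
for job in $JOBS; do gnt-job watch $job; done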
