
Please find the latest version of these slides at:
The master IP should:
version
is a good 'no op')
gnt-cluster verify output should not contain the word "ERROR"
Keep a long history of utilization for capacity planning, budgeting, and troubleshooting.
Provides information:
design doc: design-monitoring-agent.rst
mon-collector: quick 'n dirty CLI tool
Now:
Soon(-ish):
{
  "name" : "TheCollectorIdentifier",
  "version" : "1.2",
  "format_version" : 1,
  "timestamp" : 1351607182000000000,
  "category" : null,
  "kind" : 0,
  "data" : { "plugin_specific_data" : "go_here" }
}
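As a minimal sketch of how a consumer might load such a report, the Python below parses the example, checks that the common fields shown in it are present, and converts the nanosecond timestamp to a UTC datetime. The helper names `parse_report` and `report_time` are hypothetical and are not part of any Ganeti API.

```python
import json
from datetime import datetime, timezone

# Common fields, taken from the example report in the slides.
REQUIRED_FIELDS = ("name", "version", "format_version",
                   "timestamp", "category", "kind", "data")


def parse_report(text):
    """Parse a collector report (hypothetical helper) and check its common fields."""
    report = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in report]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return report


def report_time(report):
    """Convert the nanosecond timestamp to a UTC datetime, dropping sub-second precision."""
    seconds = report["timestamp"] // 1_000_000_000
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


EXAMPLE = """
{ "name" : "TheCollectorIdentifier", "version" : "1.2", "format_version" : 1,
  "timestamp" : 1351607182000000000, "category" : null, "kind" : 0,
  "data" : { "plugin_specific_data" : "go_here" } }
"""

report = parse_report(EXAMPLE)
print(report["name"], report_time(report).isoformat())
```

Integer division keeps the conversion exact; going through a float would lose precision at nanosecond resolution.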
name: the name of the plugin. Unique string.
version: the version of the plugin. A string.
format_version: the version of the data format of the plugin. Incremental integer.
timestamp: when the report was produced. Nanoseconds. Can be zero-padded.
Status reporting collectors introduce a mandatory part inside the data section.
"data" : { ... "status" : { "code" : <value> "message: "some summary goes here" } }
<value>:
by increasing criticality level
node.example.com:1815
/ (return the list of supported protocol versions)
/1/list/collectors
/1/report/all
/1/report/[category]/[collector_name]
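The paths above can be assembled into request URLs as in the sketch below. It assumes the agent is reached over plain `http://` on the port from the example address; the function names and the `storage`/`drbd` values in the usage line are illustrative placeholders, not taken from the slides.

```python
from urllib.parse import quote

AGENT_PORT = 1815  # port from the example address in the slides


def agent_url(host, *path, port=AGENT_PORT):
    """Join path components under the agent root (the http scheme is an assumption)."""
    base = "http://%s:%d/" % (host, port)
    return base + "/".join(quote(part, safe="") for part in path)


def versions_url(host):
    """Root resource: list of supported protocol versions."""
    return agent_url(host)


def list_collectors_url(host, version=1):
    return agent_url(host, str(version), "list", "collectors")


def report_all_url(host, version=1):
    return agent_url(host, str(version), "report", "all")


def report_url(host, category, collector_name, version=1):
    return agent_url(host, str(version), "report", category, collector_name)


# "storage" and "drbd" are placeholder category and collector names.
print(report_url("node.example.com", "storage", "drbd"))
```

Quoting each component separately keeps an unexpected `/` in a collector name from changing the path structure.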
doc/design-reason-trail.rst
List of triples (source, reason, timestamp)
[("user", "Cleanup of unused instances", 1363088484000000000),
 ("gnt:client:gnt-instance", "stop", 1363088484020000000),
 ("gnt:opcode:shutdown", "job=1234;index=0", 1363088484026000000),
 ("gnt:daemon:noded:shutdown", "", 1363088484135000000)]
source: the entity deciding to perform/forward the command. Free form, but the gnt: prefix is reserved
reason: why the entity decided to perform the operation
timestamp: timestamp since epoch, in nanoseconds
--reason
reason parameter added to the request
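A minimal sketch of how a reason trail grows as a command is forwarded: each entity appends one (source, reason, timestamp) triple. The `add_reason` helper and its `internal` flag are hypothetical, used here only to illustrate that the gnt: prefix is reserved; this is not Ganeti's actual code.

```python
import time

RESERVED_PREFIX = "gnt:"


def add_reason(trail, source, reason, internal=False, now_ns=None):
    """Return a new trail with (source, reason, timestamp) appended.

    `internal` is a hypothetical flag marking callers allowed to use
    the reserved "gnt:" prefix.
    """
    if source.startswith(RESERVED_PREFIX) and not internal:
        raise ValueError("the %r source prefix is reserved" % RESERVED_PREFIX)
    if now_ns is None:
        now_ns = time.time_ns()  # nanoseconds since epoch
    return trail + [(source, reason, now_ns)]


# Rebuild the first two entries of the example trail.
trail = add_reason([], "user", "Cleanup of unused instances",
                   now_ns=1363088484000000000)
trail = add_reason(trail, "gnt:client:gnt-instance", "stop",
                   internal=True, now_ns=1363088484020000000)
print(trail)
```

Returning a new list rather than mutating the argument means each layer can extend the trail it received without affecting the caller's copy.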