The metric and label conventions presented in this document are not required for using Prometheus, but can serve as both a style-guide and a collection of best practices. Individual organizations may want to approach some of these practices, e.g. naming conventions, differently.

A metric name...

* ...should have a prefix, sometimes referred to as `namespace` by client libraries. For metrics specific to an application, the prefix is usually the application name itself. Sometimes, however, metrics are more generic, like standardized metrics exported by client libraries. Examples:
  * `prometheus_notifications_total` (specific to the Prometheus server)
  * `process_cpu_seconds_total` (exported by many client libraries)
  * `http_request_duration_seconds` (for all HTTP requests)
* ...should have a suffix describing the unit. Note that an accumulating count has `total` as a suffix, in addition to the unit if applicable. Also note that this applies to units in the narrow sense (like the units in the table below), but not to countable things in general. For example, `connections` or `notifications` are not considered units for this rule and do not have to be at the end of the metric name. (See also examples in the next paragraph.)
  * `http_request_duration_seconds`
  * `node_memory_usage_bytes`
  * `http_requests_total` (for a unit-less accumulating count)
  * `process_cpu_seconds_total` (for an accumulating count with unit)
  * `foobar_build_info` (for a pseudo-metric that provides metadata about the running binary)
  * `data_pipeline_last_record_processed_timestamp_seconds` (for a timestamp that tracks the time of the latest record processed in a data processing pipeline)
* ...may order its name components so that related metrics are grouped together when a list of metric names is sorted lexicographically. The following examples put the common name components first:
  * `prometheus_tsdb_head_truncations_closed_total`
  * `prometheus_tsdb_head_truncations_established_total`
  * `prometheus_tsdb_head_truncations_failed_total`
  * `prometheus_tsdb_head_truncations_total`

  The following examples are also valid, but follow a different trade-off. They are easier to read individually, but unrelated metrics like `prometheus_tsdb_head_series` might get sorted in between.
  * `prometheus_tsdb_head_closed_truncations_total`
  * `prometheus_tsdb_head_established_truncations_total`
  * `prometheus_tsdb_head_failed_truncations_total`
  * `prometheus_tsdb_head_truncations_total`
* ...should represent the same logical thing-being-measured across all label dimensions.
  * request duration
  * bytes of data transfer
  * instantaneous resource usage as a percentage
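
The sorting trade-off described in the list above can be checked with a small sketch. It is illustrative only: `is_contiguous` is a hypothetical helper, not part of any Prometheus tooling, and it assumes plain lexicographic ordering as produced by Python's `sorted()`. Only the metric names come from the examples above.

```python
# Style 1: common name components first.
common_first = [
    "prometheus_tsdb_head_truncations_closed_total",
    "prometheus_tsdb_head_truncations_established_total",
    "prometheus_tsdb_head_truncations_failed_total",
    "prometheus_tsdb_head_truncations_total",
]
# Style 2: easier to read individually.
readable = [
    "prometheus_tsdb_head_closed_truncations_total",
    "prometheus_tsdb_head_established_truncations_total",
    "prometheus_tsdb_head_failed_truncations_total",
    "prometheus_tsdb_head_truncations_total",
]
unrelated = ["prometheus_tsdb_head_series"]


def is_contiguous(group, others):
    """Return True if the names in `group` stay adjacent once sorted with `others`."""
    ordered = sorted(list(group) + list(others))
    positions = [ordered.index(name) for name in group]
    return max(positions) - min(positions) == len(group) - 1


grouped_first = is_contiguous(common_first, unrelated)
grouped_second = is_contiguous(readable, unrelated)
print(grouped_first, grouped_second)
```

With these names, the first style keeps the truncation metrics adjacent, while in the second style `prometheus_tsdb_head_series` sorts between `prometheus_tsdb_head_failed_truncations_total` and `prometheus_tsdb_head_truncations_total`.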

As a rule of thumb, either the `sum()` or the `avg()` over all dimensions of a given metric should be meaningful (though not necessarily useful). If it is not meaningful, split the data up into multiple metrics. For example, having the capacity of various queues in one metric is good, while mixing the capacity of a queue with the current number of elements in the queue is not.
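
The rule of thumb can be made concrete with a few numbers. The following sketch uses made-up sample values and hypothetical metric and label names (`queue_capacity`, `queue_length`, a `queue` label, a `kind` label); it only demonstrates why one aggregation is meaningful and the other is not.

```python
from statistics import mean

# Hypothetical samples of a `queue_capacity` metric, one per `queue` label value.
queue_capacity = {"ingest": 100, "export": 50, "retry": 25}

# Both aggregations answer a sensible question: total and average capacity.
total_capacity = sum(queue_capacity.values())
average_capacity = mean(list(queue_capacity.values()))

# Anti-pattern: one metric that mixes capacity and current length via a `kind` label.
mixed = {("ingest", "capacity"): 100, ("ingest", "length"): 40}
# The sum is neither a capacity nor a length, so the data belongs in two metrics.
meaningless_sum = sum(mixed.values())

# Split version: each metric is meaningful on its own, and so is their quotient.
queue_length = {"ingest": 40, "export": 10, "retry": 0}
fill_ratio = sum(queue_length.values()) / total_capacity
```

Summing the split metrics separately gives a total length and a total capacity, and dividing one by the other gives an overall fill ratio between 0 and 1.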

Use labels to differentiate the characteristics of the thing that is being measured:

* `api_http_requests_total` - differentiate request types: `operation="create|update|delete"`
* `api_request_duration_seconds` - differentiate request stages: `stage="extract|transform|load"`

Do not put the label names in the metric name, as this introduces redundancy and will cause confusion if the respective labels are aggregated away.
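
To illustrate what "aggregated away" means, here is a minimal sketch in plain Python. The sample values are invented, and `sum_without` is a hypothetical helper that only roughly mimics PromQL's `sum without (...)`; it is not how Prometheus implements aggregation.

```python
from collections import defaultdict

# Hypothetical samples of api_http_requests_total, keyed by their label set.
samples = {
    (("operation", "create"),): 120.0,
    (("operation", "update"),): 300.0,
    (("operation", "delete"),): 30.0,
}


def sum_without(samples, dropped_label):
    """Sum sample values after removing one label from every label set."""
    totals = defaultdict(float)
    for labels, value in samples.items():
        remaining = tuple(pair for pair in labels if pair[0] != dropped_label)
        totals[remaining] += value
    return dict(totals)


# After dropping `operation`, a single series with an empty label set remains.
aggregated = sum_without(samples, "operation")
print(aggregated)
```

The name `api_http_requests_total` still describes the aggregated result accurately. A name that embedded the label, for example a hypothetical `api_http_requests_by_operation_total`, would no longer match the data once `operation` is gone.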

Prometheus does not have any units hard-coded. For better compatibility, base units should be used. The following table lists some metric families with their base unit. The list is not exhaustive.

Family | Base unit | Remark |
---|---|---|
Time | seconds | |
Temperature | celsius | celsius is preferred over kelvin for practical reasons. kelvin is acceptable as a base unit in special cases like color temperature or where temperature has to be absolute. |
Length | meters | |
Bytes | bytes | |
Bits | bytes | To avoid confusion combining different metrics, always use bytes, even where bits appear more common. |
Percent | ratio | Values are 0–1 (rather than 0–100). `ratio` is only used as a suffix for names like `disk_usage_ratio`. The usual metric name follows the pattern `A_per_B`. |
Voltage | volts | |
Electric current | amperes | |
Energy | joules | |
Power | | Prefer exporting a counter of joules, then `rate(joules[5m])` gives you power in Watts. |
Mass | grams | grams is preferred over kilograms to avoid issues with the kilo prefix. |
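
As a rough sketch of how the table might be applied when instrumenting code, the snippet below converts a few common non-base units before export. The `TO_BASE` mapping and the `to_base_unit` helper are hypothetical and not part of any Prometheus client library; the last lines work through the power remark with invented counter readings.

```python
# Hypothetical conversion table: source unit -> (base-unit suffix, multiplier).
TO_BASE = {
    "milliseconds": ("seconds", 1e-3),
    "kilometers": ("meters", 1e3),
    "bits": ("bytes", 1 / 8),
    "percent": ("ratio", 1e-2),
    "kilograms": ("grams", 1e3),
}


def to_base_unit(name, value, unit):
    """Return (metric_name, value) with the value converted to its base unit."""
    suffix, factor = TO_BASE.get(unit, (unit, 1.0))
    return f"{name}_{suffix}", value * factor


metric, seconds = to_base_unit("http_request_duration", 250, "milliseconds")
_, ratio = to_base_unit("disk_usage", 42, "percent")

# Power from an energy counter: the increase in joules over a window divided by
# the window length in seconds is the average power in watts. This is roughly
# what rate() over a joules counter yields.
joules_start, joules_end, window_seconds = 1000.0, 1600.0, 300.0
watts = (joules_end - joules_start) / window_seconds
```

Converting at instrumentation time keeps every series of a metric in one unit, so values from different sources can be combined without per-query scaling.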