prometheus client

86565656 · Posted on 2018-9-20 09:03:29

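Preface
Prometheus is an open-source monitoring system with many advanced features. It periodically pulls the state of the monitored systems over HTTP, attaches a timestamp and other data to organize the results into time series, and identifies each time series by a metric name and labels. Users can display the data with a visualization tool and set alert thresholds to trigger alerts.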
　　
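This article introduces the Prometheus client for Go (golang). When Prometheus collects data from a monitored system, the Go client answers its requests and returns the data in a defined format; put plainly, it is just an HTTP server. The source code is on GitHub and the documentation is on GoDoc. Readers can develop directly from the documentation; this article only aims to aid understanding.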
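To make the "it is just an HTTP server" point concrete, here is a minimal sketch that uses only the Go standard library and writes the text format by hand. It is not how the client library is implemented; the metric name demo_temperature_celsius and the helper renderMetrics are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// renderMetrics formats one gauge sample in the Prometheus text format:
// a HELP line, a TYPE line, then "name value". (Hypothetical helper.)
func renderMetrics(name, help string, value float64) string {
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s gauge\n%s %g\n", name, help, name, name, value)
}

// metricsHandler answers every scrape request with the current sample.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	io.WriteString(w, renderMetrics("demo_temperature_celsius", "Hypothetical demo gauge.", 65.3))
}

func main() {
	// A throwaway test server stands in for a real listener such as :8888.
	srv := httptest.NewServer(http.HandlerFunc(metricsHandler))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

The client library takes care of this formatting, plus registration and concurrency safety, which is why the rest of the article uses it instead of hand-written output.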

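Basics
To learn the Prometheus Go client you need a test environment. I recommend running Prometheus in Docker: it deploys quickly and leaves the host system untouched. For installation see Using Docker. Prepare the Prometheus configuration file prometheus.yml locally, then mount it into the container as a volume. The configuration file contains: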
　　

global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's the Go test program.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "go-test"
    scrape_interval: 60s
    scrape_timeout: 60s
    metrics_path: "/metrics"

    static_configs:
      - targets: ["localhost:8888"]
　　

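The configuration specifies a job_name; each monitored task is treated as a job. scrape_interval is how often Prometheus collects data, and scrape_timeout is how long a single collection may take before it is abandoned. metrics_path is the HTTP path the data is served from, and targets lists the ip:port of each target, here port 8888 on the same host. This is only a basic configuration; see the official site for more.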
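The difference between the interval and the timeout can be sketched with the Go standard library. This is only an illustration of the idea, not how the Prometheus server works internally; scrapeOnce, newFakeTarget and the shortened durations are assumptions made for the demo.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

const (
	scrapeInterval = 50 * time.Millisecond  // plays the role of scrape_interval
	scrapeTimeout  = 100 * time.Millisecond // plays the role of scrape_timeout
	slowDelay      = 500 * time.Millisecond // a target slower than the timeout
)

// scrapeOnce performs a single GET and gives up once timeout has elapsed.
func scrapeOnce(url string, timeout time.Duration) (string, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// newFakeTarget starts a local stand-in for the monitored program that
// waits for delay before answering.
func newFakeTarget(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(delay)
		io.WriteString(w, "up 1\n")
	}))
}

func main() {
	target := newFakeTarget(0)
	defer target.Close()

	// One scrape per tick; each scrape is bounded by the timeout.
	ticker := time.NewTicker(scrapeInterval)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		body, err := scrapeOnce(target.URL+"/metrics", scrapeTimeout)
		fmt.Printf("scrape %d: %q err=%v\n", i, body, err)
	}
}
```

A target that answers more slowly than the timeout produces a failed scrape, which Prometheus reports as the target being down.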
　　
Once the configuration is ready, start the Prometheus service:
　　
docker run --network=host -p 9090:9090 -v /home/gaorong/project/prometheus_test/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
　　
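Host networking is used here, so Prometheus inside the container can reach the monitored program on the same machine simply as localhost. Prometheus serves its web UI and API on port 9090. The -p 9090:9090 mapping is what would expose that port under Docker's default bridge network; in host mode the container already shares the host's network stack, so Docker ignores the -p flag. After startup, open http://localhost:9090/graph; the Status drop-down menu shows the configuration file and the state of the targets. The target is DOWN for now because the service we want to monitor has not been started yet, so let's move on to the main part and write the program with the Prometheus Go client.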
　　

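The Four Data Types
Prometheus stores all data as time series, distinguished by metric name and labels; labels subdivide a metric name into finer dimensions. Each sample consists of a float64 value and a timestamp, although the timestamp is added implicitly and is often not shown. In the output shown below (taken from the metrics Prometheus exposes about itself at http://localhost:9090/metrics), go_gc_duration_seconds is the metric name, quantile="0.5" is a key-value label, and the number that follows is the float64 value.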
　　
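To make client libraries easier to use, Prometheus provides four metric types: Counter, Gauge, Histogram and Summary. Put simply, a Counter only ever increases, a Gauge can go up or down, and Histogram and Summary provide more statistical information. In the example below, the comment line # TYPE go_gc_duration_seconds summary marks the metric as a summary.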
　　

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0.5"} 0.000107458
go_gc_duration_seconds{quantile="0.75"} 0.000200112
go_gc_duration_seconds{quantile="1"} 0.000299278
go_gc_duration_seconds_sum 0.002341738
go_gc_duration_seconds_count 18
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 107
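To see how one sample line splits into metric name, labels and float64 value, here is a small standard-library sketch. parseSample is a hypothetical helper that handles only the simple shapes shown above (no escaped quotes, no commas inside label values, no explicit timestamp); it is not a full parser for the format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed line of the text format (hypothetical helper type).
type sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// parseSample splits a line such as
//   go_gc_duration_seconds{quantile="0.5"} 0.000107458
// into its metric name, labels and value.
func parseSample(line string) (sample, error) {
	s := sample{Labels: map[string]string{}}
	idx := strings.LastIndex(line, " ")
	if idx < 0 {
		return s, fmt.Errorf("no value in %q", line)
	}
	v, err := strconv.ParseFloat(line[idx+1:], 64)
	if err != nil {
		return s, err
	}
	s.Value = v

	head := strings.TrimSpace(line[:idx])
	open := strings.Index(head, "{")
	if open < 0 {
		s.Name = head // no labels at all
		return s, nil
	}
	if !strings.HasSuffix(head, "}") {
		return s, fmt.Errorf("unterminated label set in %q", line)
	}
	s.Name = head[:open]
	for _, pair := range strings.Split(head[open+1:len(head)-1], ",") {
		if pair == "" {
			continue
		}
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			return s, fmt.Errorf("bad label %q", pair)
		}
		s.Labels[strings.TrimSpace(kv[0])] = strings.Trim(kv[1], `"`)
	}
	return s, nil
}

func main() {
	for _, line := range []string{
		`go_gc_duration_seconds{quantile="0.5"} 0.000107458`,
		`go_goroutines 107`,
	} {
		s, err := parseSample(line)
		fmt.Println(s.Name, s.Labels, s.Value, err)
	}
}
```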
　　

A Basic Example demonstrates how to use these types (remember to change port 8080 in it to the 8888 used in this article):
　　

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	cpuTemp = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "cpu_temperature_celsius",
		Help: "Current temperature of the CPU.",
	})
	hdFailures = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "hd_errors_total",
			Help: "Number of hard-disk errors.",
		},
		[]string{"device"},
	)
)

func init() {
	// Metrics have to be registered to be exposed:
	prometheus.MustRegister(cpuTemp)
	prometheus.MustRegister(hdFailures)
}

func main() {
	cpuTemp.Set(65.3)
	hdFailures.With(prometheus.Labels{"device": "/dev/sda"}).Inc()

	// The Handler function provides a default handler to expose metrics
	// via an HTTP server. "/metrics" is the usual endpoint for that.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8888", nil))
}
　　

　　

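The program creates a Gauge and a CounterVec and gives each a metric name and help text. A CounterVec manages a group of Counters that share one metric name but differ in their labels; GaugeVec exists for the same reason. The code above declares one label key, "device", so a value for that label must be supplied whenever the metric is used: hdFailures.With(prometheus.Labels{"device":"/dev/sda"}).Inc().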
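The relationship between a Vec and its children can be pictured with a small standard-library model. This is a conceptual sketch only, assuming a single label; the types counter and counterVec are made up here and are not the library's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical stand-in for a single Counter.
type counter struct {
	mu    sync.Mutex
	value float64
}

// Inc adds one; a counter never decreases.
func (c *counter) Inc() {
	c.mu.Lock()
	c.value++
	c.mu.Unlock()
}

// Value returns the current count.
func (c *counter) Value() float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.value
}

// counterVec keeps one child counter per value of a single label.
type counterVec struct {
	mu       sync.Mutex
	children map[string]*counter
}

func newCounterVec() *counterVec {
	return &counterVec{children: map[string]*counter{}}
}

// With returns the child for the given label value, creating it on first use.
func (v *counterVec) With(labelValue string) *counter {
	v.mu.Lock()
	defer v.mu.Unlock()
	c, ok := v.children[labelValue]
	if !ok {
		c = &counter{}
		v.children[labelValue] = c
	}
	return c
}

// Delete drops the child for a label value and reports whether it existed.
func (v *counterVec) Delete(labelValue string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	_, ok := v.children[labelValue]
	delete(v.children, labelValue)
	return ok
}

func main() {
	hdFailures := newCounterVec()
	hdFailures.With("/dev/sda").Inc()
	hdFailures.With("/dev/sda").Inc()
	hdFailures.With("/dev/sdb").Inc()
	fmt.Println(hdFailures.With("/dev/sda").Value(), hdFailures.With("/dev/sdb").Value())
}
```

Each distinct label value gets its own independent time series, which is why every use of a Vec has to name the label value it refers to.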
　　
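After the variables are defined they are registered, and finally an HTTP server is started on port 8888, which completes the program. Prometheus collects data by periodically requesting this HTTP endpoint.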
　　
After starting the program, open http://localhost:8888/metrics in a web browser to see the data the client exposes. One fragment reads:
　　

# HELP cpu_temperature_celsius Current temperature of the CPU.
# TYPE cpu_temperature_celsius gauge
cpu_temperature_celsius 65.3

# HELP hd_errors_total Number of hard-disk errors.
# TYPE hd_errors_total counter
hd_errors_total{device="/dev/sda"} 1
　　

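The output above is the data exposed by the sample program. The CounterVec sample carries a label, while the plain Gauge needs none; this is the difference between a basic type and its Vec version. If you now look at http://localhost:9090/graph again, the target state has changed to UP.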
　　
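The example above is only a simple demo. In prometheus.yml we set the scrape interval for this job to 60s, so every 60s Prometheus sends an HTTP request for the data our program exposes. The code sets the gauge cpuTemp only once. If the value were set several times within one 60s interval, the earlier values would be overwritten and only the value at the moment of the scrape would be seen. If the value is never changed again, Prometheus keeps seeing the same constant value until the metric is explicitly removed, for example with the Delete method that the Vec types provide.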
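The "last write wins" behaviour between two scrapes can be shown with a tiny standard-library model. The gauge type below is a hypothetical stand-in for illustration, not the library's Gauge.

```go
package main

import (
	"fmt"
	"math"
	"sync/atomic"
)

// gauge stores a single float64; every Set replaces the previous value.
type gauge struct {
	bits uint64 // float64 bit pattern, accessed atomically
}

// Set overwrites the stored value.
func (g *gauge) Set(v float64) {
	atomic.StoreUint64(&g.bits, math.Float64bits(v))
}

// Read returns the current value, which is all a scrape ever sees.
func (g *gauge) Read() float64 {
	return math.Float64frombits(atomic.LoadUint64(&g.bits))
}

func main() {
	var cpuTemp gauge
	// Three updates happen between two scrapes...
	cpuTemp.Set(61.0)
	cpuTemp.Set(72.5)
	cpuTemp.Set(65.3)
	// ...but the scrape observes only the latest one.
	fmt.Println("scraped value:", cpuTemp.Read())
}
```

If the intermediate values matter, a Histogram or Summary that observes every value is the better fit than a Gauge.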
　　
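Counter, Gauge and similar types are simple to use, but metrics that are no longer needed have to be deleted by hand. In some situations we cannot know which metrics were defined earlier, so they would keep being reported to Prometheus indefinitely. This is where a custom Collector is needed.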

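Custom Collector
A Go client Collector gathers data only when it answers a request from Prometheus, and each value must be passed explicitly on every scrape; a value that is not passed is no longer maintained and disappears from Prometheus. Collector is an interface that every object collecting metrics must implement, Counter and Gauge included. It has two methods: Collect gathers the data and sends it to the channel passed in as its argument, and Describe describes the Collector. When gathering system data is expensive, a custom Collector lets you control how collection is done and optimize the process. If mature metrics already exist elsewhere, there is no need for Counter, Gauge and the like: the Collector can simply act as a proxy for them. Many advanced uses can be built on a custom Collector.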
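The "collect only at scrape time" idea can be modelled without the library. In this sketch the collector interface, the sample type and the metric name oom_crashes_total are simplified assumptions that mirror the shape of the real interface but are not its API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is one name/value pair produced during a single scrape.
type sample struct {
	Name  string
	Value float64
}

// collector is a simplified, hypothetical stand-in for the Collector idea:
// it is asked for fresh samples on every scrape.
type collector interface {
	Collect(ch chan<- sample)
}

// hostStats reports whatever is in its map at the moment of the scrape.
type hostStats struct {
	oomCountByHost map[string]int
}

// Collect sends one sample per host currently present in the map.
func (h *hostStats) Collect(ch chan<- sample) {
	for host, n := range h.oomCountByHost {
		ch <- sample{
			Name:  fmt.Sprintf("oom_crashes_total{host=%q}", host),
			Value: float64(n),
		}
	}
}

// scrape runs one collection cycle and renders the result,
// sorted so that the output is stable.
func scrape(c collector) string {
	ch := make(chan sample)
	go func() {
		c.Collect(ch)
		close(ch)
	}()
	var lines []string
	for s := range ch {
		lines = append(lines, fmt.Sprintf("%s %g", s.Name, s.Value))
	}
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

func main() {
	h := &hostStats{oomCountByHost: map[string]int{
		"foo.example.org": 42,
		"bar.example.org": 2001,
	}}
	fmt.Println(scrape(h))

	// A host that is gone from the source data vanishes from the next
	// scrape without any explicit delete call.
	delete(h.oomCountByHost, "bar.example.org")
	fmt.Println(scrape(h))
}
```

Because nothing is stored between scrapes, stale series cannot accumulate, which is the problem with manually managed Counters and Gauges described earlier.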
　　

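package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

type ClusterManager struct {
	Zone         string
	OOMCountDesc *prometheus.Desc
	RAMUsageDesc *prometheus.Desc
	// ... many more fields
}

// Simulate preparing the data.
func (c *ClusterManager) ReallyExpensiveAssessmentOfTheSystemState() (
	oomCountByHost map[string]int, ramUsageByHost map[string]float64,
) {
	// Just example fake data.
	oomCountByHost = map[string]int{
		"foo.example.org": 42,
		"bar.example.org": 2001,
	}
	ramUsageByHost = map[string]float64{
		"foo.example.org": 6.023e23,
		"bar.example.org": 3.14,
	}
	return
}

// Describe simply sends the two Descs in the struct to the channel.
func (c *ClusterManager) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.OOMCountDesc
	ch <- c.RAMUsageDesc
}

// The source listing breaks off inside Describe. Everything below is a
// minimal sketch of the missing part so that the program compiles; the
// metric names, help texts and the zone value "db" are assumptions.

// Collect runs on every scrape and emits fresh constant metrics.
func (c *ClusterManager) Collect(ch chan<- prometheus.Metric) {
	oomCountByHost, ramUsageByHost := c.ReallyExpensiveAssessmentOfTheSystemState()
	for host, oomCount := range oomCountByHost {
		ch <- prometheus.MustNewConstMetric(c.OOMCountDesc, prometheus.CounterValue, float64(oomCount), host)
	}
	for host, ramUsage := range ramUsageByHost {
		ch <- prometheus.MustNewConstMetric(c.RAMUsageDesc, prometheus.GaugeValue, ramUsage, host)
	}
}

func main() {
	zone := prometheus.Labels{"zone": "db"}
	c := &ClusterManager{
		Zone:         "db",
		OOMCountDesc: prometheus.NewDesc("clustermanager_oom_crashes_total", "Number of OOM crashes.", []string{"host"}, zone),
		RAMUsageDesc: prometheus.NewDesc("clustermanager_ram_usage_bytes", "RAM usage as reported to the cluster manager.", []string{"host"}, zone),
	}
	prometheus.MustRegister(c)
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8888", nil))
}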
