Slug: 2023/01/29/Configure-a-home-service-monitoring-center-with-Grafana
Tags: Tools, Linux, docker-compose
Category: Tech sharing
Last edited: Oct 22, 2023 01:31 PM
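During the school term I went so long without using the ESXi server at home that I forgot its root password. I ended up wiping ESXi and installing Ubuntu 20.04 in its place, which also took my original Grafana + Prometheus monitoring center offline.
Recently, to get better visibility into how the system is running and to keep my services stable, I decided to set the monitoring center up again. This post records the process.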
1 Preparation
1.1 Install the runtime environment
- Install Prometheus
sudo apt install prometheus prometheus-alertmanager prometheus-node-exporter
- Install cAdvisor
sudo apt install cadvisor
- Install Docker
sudo apt install docker.io
1.2 Configure Prometheus
- Configure Prometheus
/etc/prometheus/prometheus.yml
# Sample config for Prometheus.

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'home'

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090', 'localhost:9100', 'localhost:8081']

  - job_name: node
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']

  - job_name: traefik
    # Traefik's metrics endpoint (port 8082 in this setup).
    static_configs:
      - targets: ['localhost:8082']
- Configure cAdvisor (cAdvisor listens on port 8080 by default; if that conflicts with another service, change it as in the example below)
/etc/default/cadvisor
# config options for cadvisor(1)
#
# Docker endpoint to connect to
# Default: unix:///var/run/docker.sock
CADVISOR_DOCKER_ENDPOINT="unix:///var/run/docker.sock"

# Port to listen on
# Default: 8080
CADVISOR_PORT="8081"

# Storage driver
# Default: none/blank
#
# Available Options:
#  - <empty>
#  - bigquery
#  - elasticsearch
#  - kafka
#  - redis
#  - statsd
#  - stdout
CADVISOR_STORAGE_DRIVER=""

# Storage driver host
# Default: localhost:8086"
CADVISOR_STORAGE_DRIVER_HOST="localhost:8086"

# Storage driver password
# Default: root
CADVISOR_STORAGE_DRIVER_PASSWORD="root"

# Storage driver secure connection
# Default: false
CADVISOR_STORAGE_DRIVER_SECURE="false"

# Storage driver user
# Default: root
CADVISOR_STORAGE_DRIVER_USER="root"

# Log to stderr ("true" logs to journal on systemd
# and "false" to "/var/log/cadvisor.log" on SysV)
# Default: true
CADVISOR_LOG_TO_STDERR="true"

# Other options:
#DAEMON_ARGS=""
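Because Prometheus scrapes cAdvisor at localhost:8081 in the configuration above, the port in /etc/default/cadvisor has to match that scrape target. Here is a minimal Python sketch for checking this, assuming the file keeps the simple KEY="value" format shown here. The helper names parse_defaults and cadvisor_target are illustrative and not part of cAdvisor.

```python
import shlex


def parse_defaults(text):
    """Parse KEY="value" lines from a Debian-style /etc/default file."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # shlex.split removes the surrounding shell quotes.
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def cadvisor_target(settings, host="localhost"):
    """Return the host:port string that Prometheus should scrape."""
    # Fall back to 8080, the default port noted in the file's comments.
    port = settings.get("CADVISOR_PORT", "8080")
    return f"{host}:{port}"


# Hand-written excerpt in the same format as /etc/default/cadvisor.
sample = '''
# Port to listen on
# Default: 8080
CADVISOR_PORT="8081"
CADVISOR_STORAGE_DRIVER=""
'''
print(cadvisor_target(parse_defaults(sample)))
```

To check the real file, read /etc/default/cadvisor and pass its contents to parse_defaults; the result should match the cAdvisor entry in the targets list of prometheus.yml.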
1.3 Access test
http://<ip>:9090
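Besides opening the page in a browser, the scrape targets can be checked through Prometheus' HTTP API at /api/v1/targets. The following is a rough sketch, assuming Prometheus is reachable at http://localhost:9090. The function names and the sample payload are made up for illustration; only the endpoint and the response fields come from Prometheus.

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the scrape targets from Prometheus' HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return (job, instance, lastError) for every target that is not 'up'."""
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            labels = target.get("labels", {})
            problems.append(
                (labels.get("job"), labels.get("instance"), target.get("lastError", ""))
            )
    return problems


# Hand-written sample in the shape of an /api/v1/targets response.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "node", "instance": "localhost:9100"},
                "health": "up",
                "lastError": "",
            },
            {
                "labels": {"job": "traefik", "instance": "localhost:8082"},
                "health": "down",
                "lastError": "connection refused",
            },
        ]
    },
}
print(unhealthy_targets(sample))
```

On the server itself, unhealthy_targets(fetch_targets()) should return an empty list once the Prometheus, node exporter, cAdvisor and Traefik targets are all being scraped successfully.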
1.4 Start the Grafana container
docker-compose up -d
- docker-compose.yaml
version: "2"
services:
  grafana:
    image: grafana/grafana
    #ports:
    #  - 3000:3000
    expose:
      - 3000
    user: "472"
    restart: always
    logging:
      options:
        max-size: 1m
    environment:
      # No quotes here: in list form they would become part of the value.
      - TZ=Asia/Shanghai
      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=smtp.163.com:465
      - GF_SMTP_USER=xxxxxx@163.com
      - GF_SMTP_PASSWORD=xxxxxx
      - GF_SMTP_FROM_ADDRESS=xxxxxx@163.com
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.grafana.rule=Host(`home.example.org`) && Path(`/grafana`) || Host(`grafana.example.org`)"
      - "traefik.http.routers.grafana.entrypoints=websecure"
      - "traefik.http.routers.grafana.tls.certresolver=myresolver"
      - "traefik.http.services.grafana.loadbalancer.server.port=3000"
    volumes:
      # Persistent data storage
      - ./grafana_data:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      # Configuration file
      #- ./grafana.ini:/etc/grafana/grafana.ini
    network_mode: host
1.5 Access test
http://<ip>:3000
or the domain name configured in Traefik
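The same check can be scripted against Grafana's /api/health endpoint, which reports the state of its database. This is a small sketch, assuming Grafana answers on http://localhost:3000 as it does with host networking; the function names are illustrative.

```python
import json
import urllib.request


def fetch_grafana_health(base_url="http://localhost:3000"):
    """Fetch Grafana's health document from /api/health."""
    with urllib.request.urlopen(f"{base_url}/api/health", timeout=5) as resp:
        return json.load(resp)


def grafana_is_healthy(payload):
    """Treat Grafana as healthy when its database check reports 'ok'."""
    return payload.get("database") == "ok"


# Hand-written payloads in the shape of an /api/health response.
print(grafana_is_healthy({"database": "ok", "version": "x.y.z"}))
print(grafana_is_healthy({"database": "failing"}))
```

Running grafana_is_healthy(fetch_grafana_health()) on the server gives a quick yes/no answer after docker-compose up -d.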
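2 Configure Grafana
2.1 Configure the data source
Enter http://localhost:9090 as the URL of the Prometheus data source and save it.
2.2 Import Grafana dashboards
Here I recommend a few good dashboards shared on the official Grafana site,
along with a Traefik dashboard file I wrote myself.
Preview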
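2.3 Alert settings
The SMTP environment variables were already set in the docker-compose file above, so what remains is to define the alert rules inside Grafana.
Take the Node Exporter Full dashboard as an example.
Open the Alert tab and click the blue button at the bottom to create a new alert rule. Here CPU idle time is the metric: if idle time falls below 20%, the CPU is considered busy.
The Idle series is query F, so use F as the input to G, take its last value, feed that into the threshold expression, and select "is below 0.2".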
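The arithmetic behind this rule can be worked through by hand. The sketch below assumes the idle value is derived from the node exporter counter node_cpu_seconds_total{mode="idle"}, summed over all cores and sampled twice; the dashboard's actual panel query may differ, and the function names are hypothetical.

```python
def idle_fraction(idle_start, idle_end, window_seconds, cores=1):
    """Fraction of CPU time spent idle over the window, from 0.0 to 1.0.

    idle_start and idle_end are the summed idle-seconds counter values
    at the beginning and end of the window.
    """
    if window_seconds <= 0 or cores <= 0:
        raise ValueError("window_seconds and cores must be positive")
    return (idle_end - idle_start) / (window_seconds * cores)


def cpu_is_busy(idle, threshold=0.2):
    """Mirror the alert condition: fire when idle is below the threshold."""
    return idle < threshold


# Example: 4 cores, 60 s window, idle counter grows by 36 s in total.
# 36 / (60 * 4) = 0.15, which is below 0.2, so the alert would fire.
idle = idle_fraction(1000.0, 1036.0, window_seconds=60, cores=4)
print(round(idle, 2), cpu_is_busy(idle))
```

A value of exactly 0.2 does not fire, because the condition is strictly "below".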
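2.4 Configure notification channels
On the alerting page, open Contact points and set the default address for Email notifications. Multiple addresses are allowed. Once it is configured, you can send a test email.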
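If Grafana's test email never arrives, it can help to try the same SMTP account outside Grafana. Here is a minimal sketch using Python's standard smtplib, assuming the settings from the docker-compose file (smtp.163.com on port 465, which expects TLS from the start of the connection). The function names and the message text are made up for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipients, subject="Grafana SMTP test"):
    """Build a plain-text test mail addressed to one or more recipients."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content("This is a test message for the Grafana SMTP settings.")
    return msg


def send_test_message(msg, host, port, user, password):
    """Send the message over implicit TLS (SMTP_SSL), as port 465 requires."""
    with smtplib.SMTP_SSL(host, port, timeout=10) as server:
        server.login(user, password)
        server.send_message(msg)


# Building the message needs no network access.
message = build_test_message("xxxxxx@163.com", ["admin@example.org", "ops@example.org"])
print(message["To"])
```

Calling send_test_message(message, "smtp.163.com", 465, user, password) with the real credentials sends the mail; a login failure there points to the account settings rather than to Grafana.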
- Alert message preview
- Author: tangcuyu
- Link: https://expoli.tech/articles/2023/01/29/Configure-a-home-service-monitoring-center-with-Grafana
- License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.