/ Docker

Docker Swarm Monitoring - Part 01 (Node-exporter, Prometheus, and Grafana)

Docker Swarm Monitoring - Part 01 (Node-exporter, Prometheus, and Grafana)

An effective monitoring system can be built across a Docker Swarm cluster using services managed by swarm itself. Starting with the prometheus node-exporter to gather system info from all host machines running Docker in swarm mode. Mount the system's directories as docker volumes to accomplish read access. Prometheus exporter gathers system info such as CPU, memory, and disk usage and exports it to a website that Prometheus server can then scrape every 15 seconds and fill a Time Series Data Base. With those 2 services in place, Grafana can then be pointed at the Prometheus server to build beautiful graphs and dashboards!

Prerequisites:

Docker Swarm uses Compose v3 and uses a docker-stack.yml file, much like the docker-compose.yml files designed to be used with the docker-compose tool, which use Compose v2. One of the biggest differences you'll run into when starting services with docker stack deploy over docker-compose up/down is that docker swarm creates a Routing Mesh for you, where as with docker-compose networks and containers have to be explicitly created and linked. In swarm mode, the link: is no longer needed. Services can be included in the same stack file and, by default, be created in the same network stack at deploy time, allowing docker containers to call each other by service name. This network can then be used by other stacks and future services by calling it in the stack file and assigning a service to it. This makes it easy to keep containers on their own isolated network or to cluster certain services like metrics and logging tools together on the same network.

Here is a Compose v3 docker-stack.yml file for this project that will start three services: Grafana, Prometheus server, and Prometheus node-exporter.

docker-stack.yml

version: '3'

services:

  exporter:
    image: prom/node-exporter:latest
    ports:
      - '9100:9100'
    volumes:
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /proc:/host/proc:ro
    deploy:
      mode: global

  prometheus:
    image: prom/prometheus:latest
    ports:
      - '9090:9090'
    volumes:
      - ./data/etc/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./data/prometheus:/prometheus:rw
    deploy:
      mode: replicated
      replicas: 1

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - ./data/grafana:/var/lib/grafana:rw
    deploy:
      mode: replicated
      replicas: 1

Directory creation needs to be done before deploying this stack. A Makefile has been included to handle config, build, deploy, destroy operations and should be used as a reference for the commands that will build this thing.

make help                                                                              

config:    Copy prometheus.yml to config dir
dir:       Create directories
update:    Pull latest docker images
deploy:    Deploy to docker swarm
destroy:   Docker stack rm && rm -rf data
help:      This help dialog

With what's in the source code the stack can be started with:

sudo make

Prometheus

Exporter

Browse to the Prometheus node-exporter docs up on github and you'll see a few lines at the bottom of the readme that say how to run this in docker that look like this.

docker run -d \
  --net="host" \
  --pid="host" \
  quay.io/prometheus/node-exporter

Start by creating the stack file with just this entry. Take the image name from the docs and add it the stack file. The volumes in the stack file are mounted for prometheus to read. deploy: mode: global is saying that this service will be started on every node in the swarm cluster. Outputs to localhost:9100/

docker-stack.yml

version: '3'

services:

  exporter:
    image: prom/node-exporter:latest
    ports:
      - '9100:9100'
    volumes:
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /proc:/host/proc:ro
    deploy:
      mode: global

Start this up with the docker stack deploy command

docker stack deploy -c docker-stack.yml monitor

Creating network monitor_default
Creating service monitor_exporter

This can also be kicked off with the Makefile

make deploy

Updating service monitor_exporter (id: ivbddqpnjr7sdxre0gzopney9)

Check the service

docker service ps monitor_exporter

ID                  NAME                                         IMAGE                       NODE                DESIRED STATE       CURRENT STATE                ERROR               PORTS
cp7o7s9t33s6        monitor_exporter.76g7crzb0hk6jp9zysvegmupy   prom/node-exporter:latest   localhost           Running             Running about a minute ago

Check the logs

docker service logs monitor_exporter
...
...
[email protected]    | time="2018-03-28T08:14:47Z" level=info msg="Listening on :9100" source="node_exporter.go:76"

Browse localhost:9100/ and check it out.

node_exporter

Server

Next, start up the Prometheus Server. This will scrape the exporter at a (10 second interval) set in the prometheus.yml configuration file. This file will be configured locally and copied to the image as a volume at run time. Volumes will also be used for persistent tsdb data in case of a container restart or failure.

docker-stack.yml

prometheus:
  image: prom/prometheus:latest
  ports:
    - '9090:9090'
  volumes:
    - ./data/etc/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    - ./data/prometheus:/prometheus:rw
  deploy:
    mode: replicated
    replicas: 1

Prepare directories for mounting docker volumes. These will need read/write permissions for the default prometheus container user, which is nobody:nobody.

DATA_DIR="./data"

mkdir -p \
  "$DATA_DIR/etc" \
  "$DATA_DIR/grafana" \
  "$DATA_DIR/prometheus"

chmod 777 "$DATA_DIR/prometheus"
chown -R nobody:nobody "$DATA_DIR/prometheus"

Volumes are configured in the docker-stack.yml file. The first one is where prometheus will write it's database to. Secondly, prometheus mounts the prometheus.yml file which will come in handy later, when I start deploying this with jenkins later, because it let's me edit this file and reconfigure prometheus at deploy time.

  • ./data/prometheus:/prometheus:rw
  • ./data/etc/prometheus.yml:/etc/prometheus/prometheus.yml:ro

Check out your prometheus.yml file and make sure the exporter is added as a scrape target. This is how targets will be added in the future. Like Cadvisor and mysql-exporter.

prometheus.yml

global:
  scrape_interval:     30s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 30s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets:
      - localhost:9090

  # http://exporter:9100/metrics
  - job_name: prometheus-exporter
    scrape_interval: 10s
    metrics_path: "/metrics"
    static_configs:
    - targets:
       - exporter:9100

Use make to deploy and it will copy this config file to where it needs to go.

make config

config:
    @cp prometheus.yml $(DATA_DIR)/etc/prometheus.yml

With the prometheus server service added to the docker-stack.yml file and everything configured, redeploy the stack to add the new service.

docker stack deploy -c docker-stack.yml monitor

Creating service monitor_prometheus
Updating service monitor_exporter (id: ivbddqpnjr7sdxre0gzopney9)

Browse to localhost:9090/targets to verify connectivity.

prometheus_targets

That the server successfully scraping system data, it's now possible to query the tsdb

node_cpu{cpu="cpu0"}
node_cpu{cpu="cpu0",mode="idle"}
etc...

prometheus_query_01

Grafana

With Prometheus up and running, it's time to start Grafana. A volume is created for persistent data. Will output to localhost:3000/

docker-stack.yml

grafana:
  image: grafana/grafana
  ports:
    - "3000:3000"
  volumes:
    - ./data/grafana:/var/lib/grafana:rw
  deploy:
    mode: replicated
    replicas: 1

Deploy to start Grafana

docker stack deploy -c docker-stack.yml monitor                                        

Updating service monitor_exporter (id: ivbddqpnjr7sdxre0gzopney9)
Updating service monitor_prometheus (id: q4f07qz2tk3dvic9kc21sa3kq)
Creating service monitor_grafana

Browse to localhost:3000/login

The default user and password are: admin admin

grafana_login

Add a data source
config_add_data_source

Chose prometheus from the drop down
config_dropdown_prometheus

I used the IP from the host machine in this example
config_host_ip

Go to grafana.com/dashboards to check out the thousands of pre-made dashboards that are out there and find one that will work as a template to build on. A good one to start with in this project is the node exporter metrics on docker swarm mode dashboard or 1442

Import this dashboard to Grafana
import_dashboard_01

Chose prometheus as data source and hit Import
import_dashboard_01

And dashboard
complete_dashboard

With that, a very flexible monitoring system has been established across the swarm cluster! A lot can be done to add to it easily, with new data sources and dashboards.

Bring it all down with

sudo make destroy

Removing service monitor_exporter
Removing service monitor_grafana
Removing service monitor_prometheus
Removing network monitor_default
jahrik

jahrik

Self-taught. multilingual programmer working in DevOps automating Linux server administration and deployment with the use of tools like Chef, Ansible, and Docker Swarm.

Read More