If you go to the stacks menu, you will notice that both `traefik` and `portainer` stacks are flagged as *Limited* control, because they were created outside of Portainer. We will create and deploy the next stacks directly from the Portainer GUI.
{{< /alert >}}
## CLI tools
[`ctop`](https://github.com/bcicen/ctop) is a very useful CLI tool that works like `htop`, but dedicated to Docker containers. Install it on every Docker host:

```sh
echo "deb http://packages.azlux.fr/debian/ buster main" | sudo tee /etc/apt/sources.list.d/azlux.list
wget -qO - https://azlux.fr/repo.gpg.key | sudo apt-key add -
sudo apt update
sudo apt install -y docker-ctop
```
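
Once installed, a quick sanity check (the `-a` flag comes from ctop's own help and filters to active containers; exact flags may vary by version):

```sh
# htop-like live overview of all containers on this host
ctop

# show only running containers
ctop -a
```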
## Keep the container images up-to-date ⬆️

It's finally time to test our new cluster environment by deploying some images through the Portainer GUI. We'll start by installing [`Diun`](https://crazymax.dev/diun/), a very useful tool that notifies us when a Docker image we use has an update available in its registry.
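
To give an idea of what we'll deploy, here is a minimal sketch of a Diun service (the environment variable names follow Diun's documented configuration, but treat the schedule and timezone values as assumptions to adapt):

```yml
version: '3.7'

services:
  diun:
    image: crazymax/diun:latest
    environment:
      - TZ=Europe/Paris
      # check watched images for updates every 6 hours (cron syntax)
      - DIUN_WATCH_SCHEDULE=0 */6 * * *
      # watch services deployed in the swarm
      - DIUN_PROVIDERS_SWARM=true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      placement:
        constraints:
          - node.role == manager
```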
This is **Part V** of a more global tutorial. [Back to the first part]({{< ref "/posts/02-build-your-own-docker-swarm-cluster" >}}) to start from the beginning.
{{< alert >}}
This part is totally optional, as it's mainly focused on monitoring. Feel free to skip it.
{{< /alert >}}
## Metrics with Prometheus 🔦
Prometheus has become the de facto standard for self-hosted monitoring, in part thanks to its architecture. It's a TSDB (Time Series Database) that polls (aka scrapes) standard metrics REST endpoints, exposed by the tools to be monitored. That's the case for Traefik, as we saw in [part III]({{< ref "04-build-your-own-docker-swarm-cluster-part-3#traefik-" >}}). For tools that don't support it natively, like databases, you'll find many exporters that will do the job for you.
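
Concretely, a `/metrics` endpoint is just plain text in the Prometheus exposition format. Here is an illustrative excerpt (the metric name is a real Traefik one, the values are made up):

```txt
# HELP traefik_entrypoint_requests_total How many HTTP requests processed on an entrypoint, partitioned by status code, protocol, and method.
# TYPE traefik_entrypoint_requests_total counter
traefik_entrypoint_requests_total{code="200",entrypoint="https",method="GET",protocol="http"} 2047
```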
### Prometheus install 💽
I won't use a GlusterFS volume for storing Prometheus data, because:

* Only 1 instance is needed, on the master
* No critical data, it's just metrics
* No need for backup, and the data can grow pretty huge

First, go to the `master-01` node settings in Portainer inside *Swarm Cluster overview*, and apply a new label indicating that this node hosts the Prometheus data.

It's the equivalent of doing:
```sh
export NODE_ID=$(docker info -f '{{.Swarm.NodeID}}')
docker node update --label-add prometheus.data=true $NODE_ID
```
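
You can verify that the label is in place:

```sh
# should print a map containing prometheus.data:true
docker node inspect $NODE_ID --format '{{ .Spec.Labels }}'
```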
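Note that the `/etc/prometheus` directory must exist on the node; create it first if needed:

```sh
sudo mkdir -p /etc/prometheus
```
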
Then create a config file at `/etc/prometheus/prometheus.yml` on the `master-01` node:
```yml
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "traefik"
    static_configs:
      - targets: ["traefik_traefik:8080"]
```
It consists of 2 scrape jobs; the `targets` entries tell Prometheus where the `/metrics` endpoints live. I configure a `5s` interval, which means Prometheus will scrape the `/metrics` endpoints every 5 seconds.
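
Optionally, you can validate the file before deploying, thanks to the `promtool` binary shipped in the `prom/prometheus` image:

```sh
# parse and check the config without starting Prometheus
docker run --rm -v /etc/prometheus:/etc/prometheus \
  --entrypoint promtool prom/prometheus \
  check config /etc/prometheus/prometheus.yml
```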
Finally, create a `prometheus` stack in Portainer:
```yml
version: '3.7'

services:

  prometheus:
    image: prom/prometheus
    networks:
      - private
      - traefik_public
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.size=5GB
      - --storage.tsdb.retention.time=15d
    volumes:
      - /etc/hosts:/etc/hosts
      - /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - data:/prometheus
    deploy:
      placement:
        constraints:
          - node.labels.prometheus.data == true
      labels:
        - traefik.enable=true
        - traefik.http.routers.prometheus.middlewares=admin-ip,admin-auth
        - traefik.http.services.prometheus.loadbalancer.server.port=9090

networks:
  private:
  traefik_public:
    external: true

volumes:
  data:
```
The `private` network will serve us later for the exporters. The following settings are useful to control DB usage, as metrics can grow very quickly:
| argument                      | description                     |
| ----------------------------- | ------------------------------- |
| `storage.tsdb.retention.size` | The max DB size                 |
| `storage.tsdb.retention.time` | The max data retention duration |
Deploy it, and <https://prometheus.sw.okami101.io> should be available after a few seconds. Use the same Traefik credentials for login.
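
As a quick smoke test from a whitelisted IP (replace `admin` with your actual basic auth user; `-u` will prompt for the password), the HTTP API should answer:

```sh
# every healthy scrape target reports a value of 1 for "up"
curl -su admin 'https://prometheus.sw.okami101.io/api/v1/query?query=up'
```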
You should now have access to some metrics!
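
For example, assuming Traefik metrics are enabled as in part III, the following PromQL query typed in the *Graph* tab shows the per-second HTTP request rate over the last 5 minutes:

```txt
rate(traefik_entrypoint_requests_total[5m])
```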

In *Status > Targets*, you should have 2 endpoints enabled, corresponding to the above scrape config.

### Nodes & Containers metrics with cAdvisor & Node exporter
## Visualization with Grafana 📈