dev swarm post

2022-02-15 22:45:57 +01:00
parent 5e31236f4b
commit 2d7dc00a64
1 changed files with 55 additions and 12 deletions
--- a/content/posts/2022-02-13-build-your-own-docker-swarm-cluster.md
+++ b/content/posts/2022-02-13-build-your-own-docker-swarm-cluster.md
@@ -27,6 +27,7 @@ So the prerequisites before continue :
 * Be comfortable with SSH terminal
 * Registered for a [Hetzner Cloud account](https://accounts.hetzner.com/signUp)
 * A custom domain, I'll use `example.org` here
+* An account with a transactional mail provider such as Mailgun, SendGrid, Sendinblue, etc.

 {{< alert >}}
 You can of course apply this guide to any other cloud provider, but I doubt you can achieve a lower price.
@@ -36,31 +37,36 @@ You can of course apply this guide on any other cloud provider, but I doubt that

 At the end of this multi-step guide, you will have a complete, working, production-grade secured cluster, backup included, with optional monitoring and a complete development CI/CD workflow.

-### The stateful part for data 💾
+### 1. Cluster initialization 🌍
+
+* Initial VPS setup for Docker under Ubuntu 20.04 with proper Hetzner firewall configuration
+* `Docker Swarm` installation, **1 manager and 2 workers**
+* `Traefik`, a cloud native reverse proxy with automatic service discovery and SSL configuration
+* `Portainer` as a simple GUI for container management
+
+### 2. The stateful part for data 💾

 For all data-critical parts, I chose to use **1 dedicated VPS**. We will install :

+* `GlusterFS` as a network filesystem, configured for the cluster nodes
+* `Loki` with `Promtail` for centralized logs, fetched from the data node and Docker containers
 * `PostgreSQL` as main production database
-* `GlusterFS` as network filesystem
-* `MySQL`, `MongoDB`, `Redis` as optional storage tools
-* `Elasticsearch` that will be used for tracing storage (`Jaeger`)
-* `Loki` with `Promtail` for centralized logs
-* `Restic` for backups to external S3 compatible bucket
+* `MySQL` as an additional secondary database (optional)
+
+Note that I will not set this up for **HA** (High Availability) here, as that is another topic entirely. This data node will therefore be our **SPOF** (Single Point of Failure), with only one file system and DB.

 {{< alert >}}
 There is much debate about running databases as Docker containers, but I personally prefer to use a managed server for better control, local on-disk performance, central backup management and easier database clustering.  
 Note that in the Kubernetes world, running containerized databases has become a reality thanks to [powerful operators](https://github.com/zalando/postgres-operator) that provide easy clustering. There is obviously no such thing on Docker Swarm 🙈
 {{< /alert >}}

-### The stateless part, aka cluster 🚆
+#### Data Backup (optional)

-#### Cluster initialization 🌍
+Because backups should be taken care of from the beginning, I'll show you how to use `Restic` for simple backups to an external S3-compatible bucket.

-* `Docker Swarm` installation, **1 manager and 2 workers**
-* `Traefik`, a cloud native reverse proxy with automatic service discovery and SSL configuration
-* `Portainer` as simple GUI for containers management
+#### Testing the cluster with some containerized tools ✅

-#### Testing with some initial containerized tools ✅
+We will use the main Portainer GUI to install the following tools :

 * `pgAdmin` and `phpMyAdmin` as web database managers (optional)
 * [`Diun`](https://crazymax.dev/diun/) (optional), very useful for being notified of updates to all images used in your Swarm cluster
@@ -70,6 +76,8 @@ Note as on the Kubernetes world, run containerized databases becomes reality tha

 * `Prometheus` as time series DB for monitoring, with many exporters (Node, PostgreSQL, MySQL, cAdvisor, Traefik)
 * `Jaeger` as *tracing* tool
+  * We will use `Elasticsearch` as its main data storage
+* `Traefik` configuration for metrics and tracing
 * `Grafana` as GUI dashboard builder with many batteries-included dashboards
  * Monitoring all the cluster
  * Node, PostgreSQL and MySQL metrics
@@ -93,6 +101,41 @@ Note as this cluster will be intended for developer user with complete self-host
 * `runner-01` : An additional worker dedicated to CI/CD pipelines execution
 * `data-01` : The critical data node, with attached and resizable volume for better flexibility

+{{< mermaid >}}
+flowchart TD
+subgraph manager-01
+traefik((Traefik))<-- Container Discovery -->docker[Docker API]
+end
+subgraph worker-01
+my-app-01((My App 01))
+my-app-02((My App 02))
+end
+subgraph runner-01
+runner((Drone CI runner))
+end
+subgraph data-01
+logs[Loki]
+postgresql[(PostgreSQL)]
+files[/GlusterFS/]
+mysql[(MySQL)]
+end
+manager-01 == As Worker Node ==> worker-01
+manager-01 == As Worker Node ==> runner-01
+traefik -. reverse proxy .-> my-app-01
+traefik -. reverse proxy .-> my-app-02
+my-app-01 -.-> postgresql
+my-app-02 -.-> mysql
+my-app-01 -.-> files
+my-app-02 -.-> files
+{{< /mermaid >}}
+
+Note that each hostname corresponds to a particular type of server, dedicated to one specific task. Each type of node can be scaled as you wish :
+
+* `manager-0x` : For an advanced, resilient Swarm quorum
+* `worker-0x` : For better scaling of production apps, the easiest to set up
+* `runner-0x` : More power for pipeline execution
+* `data-0x` : The hard part for data **HA**, with GlusterFS replication, DB clustering for PostgreSQL and MySQL, etc.
+
 {{< alert >}}
 For a simple production cluster, you can start with only `manager-01` and `data-01` as an absolute minimum.  
 From a development perspective, you can skip `worker-01` and run production workloads on `manager-01`.