consistent double comma

@@ -12,7 +12,7 @@ Use GitOps workflow for building a production grade on-premise Kubernetes cluster
 
 ## The goal 🎯
 
-This guide is mainly intended for developers or SREs who want to build a Kubernetes cluster that respects the following conditions :
+This guide is mainly intended for developers or SREs who want to build a Kubernetes cluster that respects the following conditions:
 
 1. **On-Premise management** (The Hard Way), so no vendor lock-in to any managed Kubernetes provider (KaaS/CaaS)
 2. Hosted on an affordable VPS provider (**Hetzner**), with strong **Terraform support**, allowing **GitOps** principles

@@ -124,7 +124,7 @@ storage-03 --> db-streaming
 
 As an HA Kubernetes cluster can quickly get expensive, a good cloud provider is an essential part.
 
-After testing many providers, such as DigitalOcean, Vultr, Linode, Civo, OVH, and Scaleway, it seems like **Hetzner** is very well suited, **in my opinion** :
+After testing many providers, such as DigitalOcean, Vultr, Linode, Civo, OVH, and Scaleway, it seems like **Hetzner** is very well suited, **in my opinion**:
 
 * Very competitive price for middle-range performance (plans at only around **$6** for 2 CPU/4 GB per node)
 * No frills, just the basics: VMs, block volumes, load balancers, firewalls, and that's it
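
You can explore the current offer yourself with the official [hcloud CLI](https://github.com/hetznercloud/cli), a quick sketch, assuming you already created a Hetzner Cloud project and API token:

```sh
# register the project token under a context (prompts for the token)
hcloud context create my-project
# list all server types with their cores, memory and disk
hcloud server-type list
# list datacenter locations
hcloud location list
```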

@@ -152,7 +152,7 @@ We will also need some expendable block volumes for our storage nodes. Let's sta
 
 We targeted **€60/month** for a minimal working CI/CD cluster, so we are good!
 
-You may also prefer to take **2 larger** cx31 worker nodes (**8 GB** RAM) instead of **3 smaller** ones, which [will optimize resource usage](https://learnk8s.io/kubernetes-node-size), so :
+You may also prefer to take **2 larger** cx31 worker nodes (**8 GB** RAM) instead of **3 smaller** ones, which [will optimize resource usage](https://learnk8s.io/kubernetes-node-size), so:
 
 (5.39+**7**\*0.5+**5**\*4.85+**2**\*9.2+**2**\*0.88)\*1.2 = **€63.96** / month
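
For reference, here is my reading of the terms above (a hypothetical mapping, assuming Hetzner's net prices at the time of writing: 1 load balancer, 7 primary IPs, 5 cx21 nodes, 2 cx31 workers, 2 block volumes, all times 20% VAT), which you can verify with `bc`:

```sh
echo "scale=2; (5.39 + 7*0.5 + 5*4.85 + 2*9.2 + 2*0.88) * 1.2" | bc
# 63.96
```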

@@ -68,12 +68,12 @@ However, writing all terraform logic from scratch is a bit tedious, even more if
 
 ### Choosing K3s Terraform module
 
-We have mainly 2 options :
+We have mainly 2 options:
 
-* Using the strongest community-driven module for Hetzner : [Kube Hetzner](https://registry.terraform.io/modules/kube-hetzner/kube-hetzner/hcloud/latest)
+* Using the strongest community-driven module for Hetzner: [Kube Hetzner](https://registry.terraform.io/modules/kube-hetzner/kube-hetzner/hcloud/latest)
 * Writing our own reusable module, or using my [existing start-kit module](https://registry.terraform.io/modules/okami101/k3s)
 
-Here are the pros and cons of each module :
+Here are the pros and cons of each module:
 
 | | [Kube Hetzner](https://registry.terraform.io/modules/kube-hetzner/kube-hetzner/hcloud/latest) | [Okami101 K3s](https://registry.terraform.io/modules/okami101/k3s) |
 | --- | --- | --- |

@@ -89,13 +89,13 @@ Here are the pros and cons of each module :
 | **Security** | Needs an SSH private key because of local provisioners, and SSH port opened on every node | Requires only a public SSH key, minimizes open SSH ports to controllers only, uses SSH jump from a controller to access any internal worker node |
 | **Reusability** | Vendor-locked to Hetzner Cloud | Easy to adapt to a different cloud provider as long as it supports **cloud-config** (as 99% of them do) |
 
-To sum up, choose the Kube Hetzner module if :
+To sum up, choose the Kube Hetzner module if:
 
 * You want to use an OS optimized for containers, but note that it uses more RAM than a Debian-like distro (230 MB vs 120 MB)
 * Strong community support is important for you
 * You need [Hcloud Controller](https://github.com/hetznercloud/hcloud-cloud-controller-manager) functionalities from the ground up, for **autoscaling** and **dynamic load balancing** support
 
-Choose the starter-kit module if :
+Choose the starter-kit module if:
 
 * You want to use a more standard OS, such as Debian or Ubuntu, which consumes less RAM, managed by preinstalled Salt
 * You prefer to start with a simplistic module, without internal hacks, giving you a better understanding of the cluster setup step by step, and more portable to another cloud provider
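
To make the SSH jump mentioned in the security comparison above concrete, here is what it typically looks like in `~/.ssh/config` (a sketch with hypothetical host names, user and private IP):

```sh
# the controller is the only node exposing SSH publicly
Host kube
    HostName <controller-public-ip>
    User rocks

# internal workers are reached through the controller via ProxyJump
Host kube-worker-01
    HostName 10.0.0.3
    User rocks
    ProxyJump kube
```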

@@ -106,7 +106,7 @@ For this guide, I'll consider using the starter kit as it's more suited for tuto
 
 ### 1st Terraform project
 
-Let's initialize a basic cluster setup. Create an empty folder (I name it `demo-kube-hcloud` here) for our Terraform project, and create the following `kube.tf` file :
+Let's initialize a basic cluster setup. Create an empty folder (I name it `demo-kube-hcloud` here) for our Terraform project, and create the following `kube.tf` file:
 
 {{< highlight host="demo-kube-hcloud" file="kube.tf" >}}
 

@@ -234,7 +234,7 @@ I'm using a local backend for simplicity, but for teams sharing, you may use mor
 
 Treat the Terraform state very carefully and keep it in a secured place, as it's the only source of truth for your cluster. If it leaks, consider the cluster **compromised, and activate your DRP (disaster recovery plan)**. The first vital action is at least to renew the Hetzner Cloud and S3 tokens immediately.
 
-In any case, consider any leak of a writable Hetzner Cloud token as a **Game Over**. Indeed, even if the attacker has no direct access to existing servers, mainly because the cluster SSH private key as well as the kubeconfig are not stored in the Terraform state, they still have full control of the infrastructure, and can do the following actions :
+In any case, consider any leak of a writable Hetzner Cloud token as a **Game Over**. Indeed, even if the attacker has no direct access to existing servers, mainly because the cluster SSH private key as well as the kubeconfig are not stored in the Terraform state, they still have full control of the infrastructure, and can do the following actions:
 
 1. Create a new server in the same cluster network with their own SSH access.
 2. Install a new K3s agent and connect it to the controllers thanks to the generated K3s token stored in the Terraform state.
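
To illustrate how cheap action 2 is for an attacker, joining a rogue agent is a one-liner with the official K3s install script (a sketch with placeholder values; `K3S_URL` and `K3S_TOKEN` are the documented install variables):

```sh
# run on the newly created server inside the cluster network
curl -sfL https://get.k3s.io | K3S_URL=https://10.0.0.2:6443 K3S_TOKEN=<token-from-terraform-state> sh -
```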

@@ -394,14 +394,14 @@ export TF_VAR_s3_secret_key="xxx"
 
 #### Terraform apply
 
-It's finally time to initialize the cluster :
+It's finally time to initialize the cluster:
 
 ```sh
 terraform init
 terraform apply
 ```
 
-Check the printed plan and confirm. The cluster creation will take about 1 minute. When finished, the following SSH configuration should appear :
+Check the printed plan and confirm. The cluster creation will take about 1 minute. When finished, the following SSH configuration should appear:
 
 ```sh
 Host kube

@@ -468,7 +468,7 @@ kube-worker-01 Ready <none> 152m v1.27.4+k3s1
 
 #### Kubectl Aliases
 
-As we'll use `kubectl` a lot, I highly encourage you to use aliases for better productivity :
+As we'll use `kubectl` a lot, I highly encourage you to use aliases for better productivity:
 
 * <https://github.com/ahmetb/kubectl-aliases> for bash
 * <https://github.com/shanoor/kubectl-aliases-powershell> for PowerShell
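
To give you an idea, here are a few of the hundreds of generated aliases, following the ahmetb naming scheme (the `kgpo` shortcut used later in this guide comes from this set):

```sh
alias k='kubectl'
alias kg='kubectl get'
alias kgpo='kubectl get pods'
alias kgno='kubectl get nodes'
alias kaf='kubectl apply -f'
```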

@@ -189,7 +189,7 @@ Now it's time to expose our cluster to the outside world. We'll use Traefik as i
 
 ### Traefik
 
-Apply the following file :
+Apply the following file:
 
 {{< highlight host="demo-kube-k3s" file="traefik.tf" >}}
 

@@ -369,7 +369,7 @@ We'll use [DNS01 challenge](https://cert-manager.io/docs/configuration/acme/dns0
 You may use any DNS provider that is supported by cert-manager. Check the [list of supported providers](https://cert-manager.io/docs/configuration/acme/dns01/#supported-dns01-providers). But cert-manager is highly extensible, and you can easily add your own provider if needed with some effort. Check [available contrib webhooks](https://cert-manager.io/docs/configuration/acme/dns01/#webhook).
 {{</ alert >}}
 
-First prepare variables and set them accordingly :
+First prepare variables and set them accordingly:
 
 {{< highlight host="demo-kube-k3s" file="main.tf" >}}
 

@@ -489,9 +489,9 @@ Try `test.kube.rocks` to check certificate validity. If not valid, check the cer
 
 The Traefik dashboard is a nice tool to check all ingresses and their status. Let's expose it with a simple ingress, protected by an IP whitelist and basic auth, which can be done with middlewares.
 
-First the auth variables :
+First the auth variables:
 
-First prepare variables and set them accordingly :
+First prepare variables and set them accordingly:
 
 {{< highlight host="demo-kube-k3s" file="main.tf" >}}
 

@@ -538,7 +538,7 @@ whitelisted_ips = ["82.82.82.82"]
 Note on `encrypted_admin_password`: we generate a bcrypt hash of the password, compatible with HTTP basic auth, and keep the original to avoid regenerating it each time.
 {{</ alert >}}
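
If you need to generate such a bcrypt hash manually, `htpasswd` from the Apache utils does the job (a sketch; the user and password are placeholders):

```sh
# -n prints to stdout, -B selects bcrypt, -b reads the password from the command line
htpasswd -nbB admin 'changeme'
# admin:$2y$05$...
```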
 
-Then apply the following Terraform code :
+Then apply the following Terraform code:
 
 {{< highlight host="demo-kube-k3s" file="traefik.tf" >}}
 

@@ -114,7 +114,7 @@ module "hcloud_kube" {
 
 {{< /highlight >}}
 
-SSH to both storage nodes and check that a 20GB volume is correctly mounted with the `df -h` command. It should look like :
+SSH to both storage nodes and check that a 20GB volume is correctly mounted with the `df -h` command. It should look like:
 
 ```txt
 Filesystem      Size  Used Avail Use% Mounted on

@@ -14,15 +14,15 @@ This is the **Part IV** of a more global topic tutorial. [Back to guide summary]({
 
 ## Flux
 
-In the GitOps world, 2 tools are in the lead for CD in k8s : Flux and ArgoCD. As Flux is CLI-first and more lightweight, it's my personal go-to. You may ask why not continue with the actual k8s Terraform project?
+In the GitOps world, 2 tools are in the lead for CD in k8s: Flux and ArgoCD. As Flux is CLI-first and more lightweight, it's my personal go-to. You may ask why not continue with the actual k8s Terraform project?
 
 You already noted that by adding more and more Helm dependencies to Terraform, the plan time increases, as does the state file size. So it's not very scalable.
 
-It's the perfect moment to draw a clear line between **IaC** and **CD**. IaC is for infrastructure, CD is for applications. So to sum up our GitOps stack :
+It's the perfect moment to draw a clear line between **IaC** and **CD**. IaC is for infrastructure, CD is for applications. So to sum up our GitOps stack:
 
-1. IaC for Hcloud cluster initialization (*the basement*) : **Terraform**
-2. IaC for cluster configuration (*the walls*) : **Helm** through **Terraform**
-3. CD for application deployment (*the furniture*) : **Flux**
+1. IaC for Hcloud cluster initialization (*the basement*): **Terraform**
+2. IaC for cluster configuration (*the walls*): **Helm** through **Terraform**
+3. CD for application deployment (*the furniture*): **Flux**
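
As a taste of this CLI-first approach, bootstrapping Flux onto the cluster boils down to a single command (a sketch with hypothetical repository coordinates, using the standard `flux bootstrap` flags):

```sh
flux bootstrap github \
  --owner=<github-user> \
  --repository=demo-kube-flux \
  --branch=main \
  --path=clusters/demo \
  --personal
```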
 
 {{< alert >}}
 You can probably eliminate the 2nd stack with some effort by using `Kube-Hetzner`, which takes care of ingress and storage, and using Flux directly for the remaining Helm charts like the database cluster. Or maybe you can also add custom Helm charts to `Kube-Hetzner`?

@@ -152,7 +152,7 @@ resource "helm_release" "kube_prometheus_stack" {
 
 The application is deployed in the `monitoring` namespace. It can take a few minutes to be fully up and running. You can check the status with `kgpo -n monitoring`.
 
-Important notes :
+Important notes:
 
 * We set a retention of **15 days** and **5GB** of storage for Prometheus. Set this according to your needs.
 * We allow `serviceMonitorSelector` and `podMonitorSelector` for scraping monitor CRDs from all namespaces.
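
For reference, those two notes map to the following kube-prometheus-stack chart values, shown here as plain Helm CLI flags (a sketch; in our case they are set through the `helm_release` resource above):

```sh
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.retention=15d \
  --set prometheus.prometheusSpec.retentionSize=5GB \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
```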

@@ -426,7 +426,7 @@ Grafana should be deploying and migrating the database successfully. Let's log in imme
 
 ### Native dashboards
 
-If you go to `https://grafana.kube.rocks/dashboards`, you should see many available dashboards that should already work perfectly, giving you a complete vision of :
+If you go to `https://grafana.kube.rocks/dashboards`, you should see many available dashboards that should already work perfectly, giving you a complete vision of:
 
 * Some core components of K8s, like coredns, the kube API server, all kubelets
 * Details of pods, namespaces, workloads

@@ -378,7 +378,7 @@ resource "helm_release" "traefik" {
 
 {{< /highlight >}}
 
-And finally, the route ingress :
+And finally, the route ingress:
 
 {{< highlight host="demo-kube-k3s" file="gitea.tf" >}}
 

@@ -415,7 +415,7 @@ Click on one specific trace to get details. You can go through HTTP requests, EF
 
 It would be nice to have direct access to traces from logs through Loki search, as it's clearly more seamless than searching inside Tempo.
 
-For that we need to do 2 things :
+For that we need to do 2 things:
 
 * Add the `TraceId` to logs in order to correlate traces with logs. In ASP.NET Core, a `TraceId` corresponds to a unique request, allowing isolated analysis of each request.
 * Create a link in Grafana from the generated `TraceId` inside a log to the detailed Tempo trace view.