uniformize
@@ -217,7 +217,7 @@ output "ssh_config" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

#### Explanation
@@ -267,7 +267,7 @@ Why not `debian-12` ? Because it's sadly not yet supported by [Salt project](htt

{{< alert >}}
The `nfs-common` package is required for Longhorn in order to support RWX volumes.
-{{</ alert >}}
+{{< /alert >}}

`cluster_name` is the node name prefix and will have the format `{cluster_name}-{pool_name}-{index}`, for example `kube-storage-01`. `cluster_user` is the username (UID 1000) used for SSH access with sudo rights. The `root` user is disabled for remote access for security reasons.
@@ -398,7 +398,7 @@ s3_access_key = "xxx"
s3_secret_key = "xxx"
```

-{{</ highlight >}}
+{{< /highlight >}}

{{</ tab >}}
{{< tab tabName="Environment variables" >}}
@@ -463,7 +463,7 @@ Merge above SSH config into your `~/.ssh/config` file, then test the connection

{{< alert >}}
If you get "Connection refused", it's probably because the server is still in its cloud-init phase. Wait a few minutes and try again. Be sure to have the same public IPs as the ones you whitelisted in the Terraform variables. You can edit them and reapply the Terraform configuration at any moment.
-{{</ alert >}}
+{{< /alert >}}

Before using K3s, let's enable Salt for OS management by typing `sudo salt-key -A -y`. This will accept all pending keys and allow Salt to connect to all nodes. To upgrade all nodes at once, just type `sudo salt '*' pkg.upgrade`.
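A quick sanity check before any mass operation can look like this (a minimal sketch, the exact output depends on your node names):

```sh
# list accepted and pending minion keys
sudo salt-key -L

# verify that every node answers
sudo salt '*' test.ping

# then upgrade all nodes at once
sudo salt '*' pkg.upgrade
```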
@@ -478,7 +478,7 @@ From the controller, copy `/etc/rancher/k3s/k3s.yaml` on your machine located ou
{{< alert >}}
If `~/.kube/config` already exists, you have to properly [merge the new config into it](https://able8.medium.com/how-to-merge-multiple-kubeconfig-files-into-one-36fc987c2e2f). You can use `kubectl config view --flatten` for that.
Then use `kubectl config use-context kube` to switch to your new cluster.
-{{</ alert >}}
+{{< /alert >}}

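A minimal sketch of that merge, assuming the file copied from the controller was saved as `~/.kube/k3s.yaml` (hypothetical path) and its context is named `kube`:

```sh
# back up the current config, then merge both files into a single flattened one
cp ~/.kube/config ~/.kube/config.bak
KUBECONFIG=~/.kube/config:~/.kube/k3s.yaml kubectl config view --flatten > /tmp/config.merged
mv /tmp/config.merged ~/.kube/config

# switch to the new cluster
kubectl config use-context kube
```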
Type `kubectl get nodes` and you should see the 2 nodes of your cluster in **Ready** state.
@@ -514,7 +514,7 @@ agent_nodepools = [
]
```

-{{</ highlight >}}
+{{< /highlight >}}

Then apply the Terraform configuration again. After a few minutes, you should see 2 new nodes in **Ready** state.
@@ -528,7 +528,7 @@ kube-worker-03 Ready <none> 25s v1.27.4+k3s1

{{< alert >}}
You'll have to run `sudo salt-key -A -y` each time you add a new node to the cluster so it's included in global OS management.
-{{</ alert >}}
+{{< /alert >}}

#### Deleting workers
@@ -538,7 +538,7 @@ To finalize the deletion, delete the node from the cluster with `krm no kube-wor

{{< alert >}}
If the node has some workloads running, you'll have to consider a proper [draining](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) before deleting it.
-{{</ alert >}}
+{{< /alert >}}

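A possible drain-then-delete sequence, assuming the node to remove is `kube-worker-03`:

```sh
# evict workloads gracefully before removal
kubectl drain kube-worker-03 --ignore-daemonsets --delete-emptydir-data

# then remove the node object from the cluster (same as `krm no kube-worker-03`)
kubectl delete node kube-worker-03
```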
## 1st check ✅
@@ -25,7 +25,7 @@ terraform {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

Let's begin with automatic upgrades management.
@@ -75,7 +75,7 @@ resource "helm_release" "kubereboot" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

For every `helm_release` resource you'll see in this guide, you may check the latest chart version available. Example for `kured`:
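One way to do that with plain Helm, assuming the chart is published in the `kubereboot` repository (adjust the repo name and URL to the ones you actually use):

```sh
# add and refresh the repository hosting the kured chart (URL is an assumption)
helm repo add kubereboot https://kubereboot.github.io/charts
helm repo update

# list the available chart versions
helm search repo kubereboot/kured --versions
```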
@@ -100,7 +100,7 @@ However, as Terraform doesn't offer a proper way to apply a remote multi-documen

{{< alert >}}
Don't push yourself to be 100% GitOps everywhere if the remedy brings far more code complexity. Sometimes simple documentation of manual steps in a README is better.
-{{</ alert >}}
+{{< /alert >}}

```sh
# installing system-upgrade-controller
@@ -187,11 +187,11 @@ resource "kubernetes_manifest" "agent_plan" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< alert >}}
You may set the same channel as in the previous step for the hcloud cluster creation.
-{{</ alert >}}
+{{< /alert >}}

## External access
@@ -259,7 +259,7 @@ resource "helm_release" "traefik" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

`ports.web.redirectTo` will redirect all HTTP traffic to HTTPS.
@@ -317,14 +317,14 @@ resource "hcloud_load_balancer_service" "https_service" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

Use `hcloud load-balancer-type list` to get the list of available load balancer types.

{{< alert >}}
Don't forget to add a `hcloud_load_balancer_service` resource for each service (aka port) you want to serve.
We use the `tcp` protocol as Traefik will handle SSL termination. Set `proxyprotocol` to true to allow Traefik to get the real IP of clients.
-{{</ alert >}}
+{{< /alert >}}

Once applied, use `hcloud load-balancer list` to get the public IP of the load balancer and try to curl it. You should be properly redirected to HTTPS and get a certificate error. It's time to get SSL certificates.
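A quick way to verify the behavior from the CLI (a sketch, replace the placeholder with the IP returned by `hcloud load-balancer list`):

```sh
# get the load balancer public IP
hcloud load-balancer list

# HTTP should answer with a redirect to HTTPS
curl -I http://<lb-public-ip>

# HTTPS answers too, but with an untrusted default certificate until cert-manager is in place
curl -kI https://<lb-public-ip>
```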
@@ -362,12 +362,12 @@ resource "helm_release" "cert_manager" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< alert >}}
You can use the `installCRDs` option to install CRDs automatically. But uninstalling cert-manager will then delete all associated resources, including generated certificates. That's why I generally prefer to install CRDs manually.
As always, we enable `prometheus.servicemonitor.enabled` to allow Prometheus to scrape cert-manager metrics.
-{{</ alert >}}
+{{< /alert >}}

All should be ok with `kg deploy -n cert-manager`.
@@ -377,7 +377,7 @@ We'll use [DNS01 challenge](https://cert-manager.io/docs/configuration/acme/dns0

{{< alert >}}
You may use any DNS provider supported by cert-manager. Check the [list of supported providers](https://cert-manager.io/docs/configuration/acme/dns01/#supported-dns01-providers). As cert-manager is highly extensible, you can easily create your own provider with some effort. Check [available contrib webhooks](https://cert-manager.io/docs/configuration/acme/dns01/#webhook).
-{{</ alert >}}
+{{< /alert >}}

First prepare variables and set them accordingly:
@@ -398,7 +398,7 @@ variable "dns_api_token" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< highlight host="demo-kube-k3s" file="terraform.tfvars" >}}
@@ -408,7 +408,7 @@ domain = "kube.rocks"
dns_api_token = "xxx"
```

-{{</ highlight >}}
+{{< /highlight >}}

Then we need to create a default `Certificate` k8s resource associated with a valid `ClusterIssuer` resource that will manage its generation. Apply the following Terraform code to issue the new wildcard certificate for your domain.
@@ -484,12 +484,12 @@ resource "kubernetes_manifest" "tls_certificate" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< alert >}}
You can set `acme.privateKeySecretRef.name` to **letsencrypt-staging** for testing purposes and avoid wasting the LE quota limit.
Set `privateKey.rotationPolicy` to `Always` to ensure that the certificate will be [renewed automatically](https://cert-manager.io/docs/usage/certificate/) 30 days before it expires, without downtime.
-{{</ alert >}}
+{{< /alert >}}

In the meantime, go to your DNS provider and add a new `*.kube.rocks` entry pointing to the load balancer IP.
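Once the record has propagated, a quick check from the CLI (any subdomain under the wildcard works):

```sh
# the wildcard record should resolve to the load balancer public IP
dig +short test.kube.rocks

# and the certificate should eventually reach the Ready state once the DNS01 challenge succeeds
kubectl get certificate -A
```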
@@ -530,7 +530,7 @@ resource "null_resource" "encrypted_admin_password" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< highlight host="demo-kube-k3s" file="terraform.tfvars" >}}
@@ -540,11 +540,11 @@ http_password = "xxx"
whitelisted_ips = ["82.82.82.82"]
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< alert >}}
Note on `encrypted_admin_password`: we generate a bcrypt hash of the password, compatible with HTTP basic auth, and keep the original to avoid regenerating it each time.
-{{</ alert >}}
+{{< /alert >}}

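If you want to reproduce the hash outside of Terraform, a minimal sketch assuming `htpasswd` (from `apache2-utils`) is available and `admin` is your basic auth user:

```sh
# -B selects bcrypt, the part after the colon is the hash expected by the basic auth middleware
htpasswd -nbB admin 'xxx'
```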
Then apply the following Terraform code:
@@ -619,7 +619,7 @@ resource "kubernetes_manifest" "traefik_middleware_ip" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

Now go to `https://traefik.kube.rocks` and you should be asked for credentials. After login, you should see the dashboard.
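The same check from the CLI, assuming `admin` is the basic auth user you configured (a sketch):

```sh
# without credentials, the auth middleware should reject the request
curl -I https://traefik.kube.rocks/dashboard/

# with valid credentials, the dashboard should answer with a 200
curl -I -u admin:xxx https://traefik.kube.rocks/dashboard/
```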
@@ -651,7 +651,7 @@ resource "kubernetes_manifest" "traefik_middleware_ip" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

In the case of Cloudflare, you may also need to trust the [Cloudflare IP ranges](https://www.cloudflare.com/ips-v4) in addition to the Hetzner load balancer. Just set `ports.websecure.forwardedHeaders.trustedIPs` and `ports.websecure.proxyProtocol.trustedIPs` accordingly.
@@ -664,7 +664,7 @@ variable "cloudflare_ips" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< highlight host="demo-kube-k3s" file="traefik.tf" >}}
@@ -688,7 +688,7 @@ resource "helm_release" "traefik" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

Or for testing purposes, set `ports.websecure.forwardedHeaders.insecure` and `ports.websecure.proxyProtocol.insecure` to true.
@@ -19,7 +19,7 @@ In Kubernetes world, the most difficult while essential part is probably the sto
If you are not familiar with Kubernetes storage, you must at least be aware of the pros and cons of `RWO` and `RWX` volumes when creating a `PVC`.
In general `RWO` is more performant, but only one pod can mount it, while `RWX` is slower but allows sharing between multiple pods.
`RWO` is a single node volume, and `RWX` is a shared volume between multiple nodes.
-{{</ alert >}}
+{{< /alert >}}

`K3s` comes with a built-in `local-path` provisioner, which is the most performant `RWO` solution as it directly uses the local NVMe SSD. But it's neither resilient nor scalable. I think it's a good solution for anything you consider non-critical data.
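A quick way to confirm it's there (on K3s, `local-path` is the default storage class):

```sh
kubectl get storageclass
# expected output (indicative):
# NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ...
# local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   ...
```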
@@ -126,7 +126,7 @@ The volume is of course automatically mounted on each node reboot, it's done via

{{< alert >}}
Note that if you create the volume at the same time as the node pool, Hetzner doesn't seem to mount the volume automatically. So it's preferable to create the node pool first, then add the volume as soon as the node is in ready state. You can always detach / re-attach volumes manually through the UI, which will force a proper remount.
-{{</ alert >}}
+{{< /alert >}}

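To confirm the mount from a storage node, something like this works (the node name comes from the earlier naming convention, the `HC_Volume` naming is the Hetzner default):

```sh
# the Hetzner volume should appear as an HC_Volume_* device, also declared in fstab
ssh kube-storage-01 "df -h | grep HC_Volume; grep HC_Volume /etc/fstab"
```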
### Longhorn variables
@@ -254,7 +254,7 @@ resource "helm_release" "longhorn" {
Set both `persistence.defaultClassReplicaCount` (used for the Kubernetes configuration in the longhorn storage class) and `defaultSettings.defaultReplicaCount` (for volumes created from the UI) to 2, as we have 2 storage nodes.
The toleration is required to allow Longhorn pods (managers and drivers) to be scheduled on storage nodes in addition to workers.
Note that we need Longhorn deployed on workers too, otherwise pods scheduled on these nodes can't be attached to Longhorn volumes.
-{{</ alert >}}
+{{< /alert >}}

Use `kgpo -n longhorn-system -o wide` to check that Longhorn pods are correctly running on storage nodes as well as worker nodes. You should have an `instance-manager` deployed on each node.
@@ -342,7 +342,7 @@ resource "kubernetes_manifest" "longhorn_ingress" {
{{< alert >}}
It's vital that you have at least the IP and AUTH middlewares with a strong password for Longhorn UI access, as it concerns the most critical part of the cluster.
Of course, you can skip this ingress and directly use `kpf svc/longhorn-frontend -n longhorn-system 8000:80` to access the Longhorn UI securely.
-{{</ alert >}}
+{{< /alert >}}

### Nodes and volumes configuration
@@ -576,11 +576,11 @@ resource "helm_release" "postgresql" {
}
```

-{{</ highlight >}}
+{{< /highlight >}}

{{< alert >}}
Don't forget to use fast storage by setting `primary.persistence.storageClass` and `readReplicas.persistence.storageClass` accordingly.
-{{</ alert >}}
+{{< /alert >}}

Now check that PostgreSQL pods are correctly running on storage nodes with `kgpo -n postgres -o wide`.