---
title: "A beautiful GitOps day V - Monitoring and Logging Stack"
date: 2023-08-23
description: "Follow this opinionated guide as starter-kit for your own Kubernetes platform..."
---
{{< lead >}} Use GitOps workflow for building a production grade on-premise Kubernetes cluster on cheap VPS provider, with complete CI/CD 🎉 {{< /lead >}}
This is Part V of a more global tutorial. [Back to guide summary]({{< ref "/posts/10-a-beautiful-gitops-day" >}}) for the intro.
## Monitoring
Monitoring is a critical part of any production-grade platform. It allows you to be proactive and react before your users are impacted. It also gives you a quick visualization of the cluster architecture and its current usage.
### Monitoring node pool
As with the storage pool, creating a dedicated node pool for the monitoring stack is good practice in order to scale it separately from the apps.
You now have a good understanding of how to create a node pool, so apply the following configuration from our 1st Terraform project:
{{< highlight host="demo-kube-hcloud" file="kube.tf" >}}
module "hcloud_kube" {
//...
agent_nodepools = [
//...
{
name = "monitor"
server_type = "cx21"
location = "nbg1"
count = 1
private_interface = "ens10"
labels = [
"node.kubernetes.io/server-usage=monitor"
]
taints = [
"node-role.kubernetes.io/monitor:NoSchedule"
]
}
]
}
{{< /highlight >}}
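Any workload we want to schedule on this dedicated pool needs both a toleration for the taint and a node selector on the label defined above. A minimal sketch of the pod-level fields:

```yaml
# Sketch: pod-level scheduling fields matching the monitor pool above
tolerations:
  - key: node-role.kubernetes.io/monitor
    operator: Exists
    effect: NoSchedule
nodeSelector:
  node.kubernetes.io/server-usage: monitor
```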
### Prometheus Stack
When using Kubernetes, the de facto standard is to install the Prometheus stack. It includes all the CRDs and components needed for a proper monitoring setup.
You have two choices for installing it: Flux or Terraform? Flux includes full documentation on how to install it that way.
But remember the previous chapter with the house analogy. I personally consider monitoring part of my infrastructure, and I prefer to keep all my infrastructure configuration in Terraform, using Flux only for apps. Moreover, the Prometheus stack is a pretty big Helm chart, and upgrading it can be a bit tricky, so I prefer to keep full control of it with Terraform.
Go back to the 2nd Terraform project and let's apply this pretty big boy:
{{< highlight host="demo-kube-k3s" file="monitoring.tf" >}}
resource "kubernetes_namespace_v1" "monitoring" {
metadata {
name = "monitoring"
}
}
resource "helm_release" "kube_prometheus_stack" {
chart = "kube-prometheus-stack"
version = "49.2.0"
repository = "https://prometheus-community.github.io/helm-charts"
name = "kube-prometheus-stack"
namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
set {
name = "prometheus.prometheusSpec.retention"
value = "15d"
}
set {
name = "prometheus.prometheusSpec.retentionSize"
value = "5GB"
}
set {
name = "prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues"
value = "false"
}
set {
name = "prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues"
value = "false"
}
set {
name = "prometheus.prometheusSpec.enableRemoteWriteReceiver"
value = "true"
}
set {
name = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.accessModes[0]"
value = "ReadWriteOnce"
}
set {
name = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage"
value = "8Gi"
}
set {
name = "prometheus.prometheusSpec.tolerations[0].key"
value = "node-role.kubernetes.io/storage"
}
set {
name = "prometheus.prometheusSpec.tolerations[0].operator"
value = "Exists"
}
set {
name = "prometheus.prometheusSpec.nodeSelector.node\\.kubernetes\\.io/server-usage"
value = "monitor"
}
set {
name = "alertmanager.enabled"
value = "false"
}
set {
name = "grafana.enabled"
value = "false"
}
set {
name = "grafana.forceDeployDatasources"
value = "true"
}
set {
name = "grafana.forceDeployDashboards"
value = "true"
}
}
{{< /highlight >}}
The application is deployed in the `monitoring` namespace. It takes a few minutes to be fully up and running. You can check the status with `kgpo -n monitoring`.
Important notes:

- We set a retention of 15 days and 5 GB of storage for Prometheus. Set this according to your needs.
- We allow `serviceMonitorSelector` and `podMonitorSelector` to pick up monitor CRDs from all namespaces.
- We set `enableRemoteWriteReceiver` to allow remote writes to the database for advanced specific usage, as by default Prometheus works on its own with a pull model.
- As we don't set any storage class, the default one will be used, which is `local-path` when using K3s. If you want to use Longhorn instead and benefit from automatic monitoring backups, set it with `...volumeClaimTemplate.spec.storageClassName` (see the sketch after this list). But don't forget to deploy the Longhorn manager on the monitor node by adding the monitor toleration to it.
- As it's a huge chart, I want to minimize dependencies by disabling Grafana, as I prefer to manage it separately. However, in this case we should set `grafana.forceDeployDatasources` and `grafana.forceDeployDashboards` to `true` in order to benefit from all the included Kubernetes dashboards and automatic Prometheus datasource injection; they are deployed as config maps that the upcoming Grafana install can pick up through provisioning.
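For reference, here is a sketch of the extra value to add inside the `helm_release` above if you prefer Longhorn-backed storage (assuming the `longhorn` storage class installed in the previous parts):

```tf
set {
  name  = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName"
  value = "longhorn"
}
```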
{{< alert >}}
As Terraform plans become slower and slower, you can apply one single resource directly by using the `-target` option. For example, to apply only the Prometheus stack, use `terraform apply -target=helm_release.kube_prometheus_stack`. It will save you a lot of time when testing.
{{< /alert >}}
And finally, the ingress for external access:
{{< highlight host="demo-kube-k3s" file="monitoring.tf" >}}
resource "kubernetes_manifest" "prometheus_ingress" {
manifest = {
apiVersion = "traefik.io/v1alpha1"
kind = "IngressRoute"
metadata = {
name = "prometheus"
namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
}
spec = {
entryPoints = ["websecure"]
routes = [
{
match = "Host(`prometheus.${var.domain}`)"
kind = "Rule"
middlewares = [
{
name = "middleware-ip"
namespace = "traefik"
},
{
name = "middleware-auth"
namespace = "traefik"
}
]
services = [
{
name = "prometheus-operated"
port = "http-web"
}
]
}
]
}
}
}
{{< /highlight >}}
Now go to `prometheus.kube.rocks`; after login, you should reach the Prometheus UI. Check under `/targets` that all targets are up and running. Because we enabled monitoring for all our metrics-capable apps in previous chapters, you should see the following targets:
- 1 instance of Traefik
- 1 instance of cert-manager
- 1 instance of each PostgreSQL primary and read
- 2 instances of Redis
- 5 instances of Longhorn manager
This is exactly how it works: the `ServiceMonitor` custom resource is responsible for discovering and centralizing all metrics for Prometheus, allowing automatic discovery without touching the Prometheus config. Use `kg smon -A` to list them all.
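For reference, here is a minimal sketch of what such a `ServiceMonitor` typically looks like, for a hypothetical `my-app` service exposing a `metrics` port (the charts we enabled ship their own equivalents):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: my-app
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
  endpoints:
    - port: metrics
      interval: 30s
```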
### Monitoring Flux
One thing is missing, however: monitoring for Flux itself. Go back to the Flux project and push the following manifests:
{{< highlight host="demo-kube-flux" file="clusters/demo/flux-add-ons/flux-monitoring.yaml" >}}
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
name: flux-monitoring
namespace: flux-system
spec:
interval: 30m0s
ref:
branch: main
url: https://github.com/fluxcd/flux2-monitoring-example
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: monitoring-config
namespace: flux-system
spec:
interval: 1h0m0s
path: ./monitoring/configs
prune: true
sourceRef:
kind: GitRepository
name: flux-monitoring
{{< /highlight >}}
The `spec.path` under `Kustomization` tells Flux to pull the remote monitoring manifests, sparing us from writing all of them manually. It includes the `PodMonitor` as well as Grafana dashboards.
After a few minutes, Flux should appear among the Prometheus targets.
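A few quick sanity checks, assuming the `flux` CLI from the previous parts and the `kg` alias for `kubectl get`:

```sh
flux get sources git flux-monitoring
flux get kustomizations monitoring-config
# The PodMonitor pulled from the example repo should land in flux-system
kg podmonitors -n flux-system
```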
### Grafana
We have the foundation of our monitoring stack; it's time to get a UI to visualize all these metrics. Grafana is the most popular tool for that, and it's also available as a Helm chart. Prepare some variables:
{{< highlight host="demo-kube-k3s" file="main.tf" >}}
variable "smtp_host" {
sensitive = true
}
variable "smtp_port" {
type = string
}
variable "smtp_user" {
type = string
sensitive = true
}
variable "smtp_password" {
type = string
sensitive = true
}
variable "grafana_db_password" {
type = string
sensitive = true
}
{{< /highlight >}}
Create a `grafana` database through pgAdmin, with a `grafana` user and the matching `grafana_db_password`.
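If you prefer `psql` over pgAdmin, a rough SQL equivalent would be (replace the placeholder with your `grafana_db_password`):

```sql
-- Run against the PostgreSQL primary as a superuser
CREATE ROLE grafana WITH LOGIN PASSWORD 'xxx';
CREATE DATABASE grafana OWNER grafana;
```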
{{< highlight host="demo-kube-k3s" file="terraform.tfvars" >}}
smtp_host = "smtp.mailgun.org"
smtp_port = "587"
smtp_user = "xxx"
smtp_password = "xxx"
{{< /highlight >}}
Apply the following configuration to the Terraform project:
{{< highlight host="demo-kube-k3s" file="grafana.tf" >}}
resource "helm_release" "grafana" {
chart = "grafana"
version = "6.58.9"
repository = "https://grafana.github.io/helm-charts"
name = "grafana"
namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
set {
name = "serviceMonitor.enabled"
value = "true"
}
set {
name = "sidecar.datasources.enabled"
value = "true"
}
set {
name = "sidecar.dashboards.enabled"
value = "true"
}
set {
name = "env.GF_SERVER_DOMAIN"
value = var.domain
}
set {
name = "env.GF_SERVER_ROOT_URL"
value = "https://grafana.${var.domain}"
}
set {
name = "env.GF_SMTP_ENABLED"
value = "true"
}
set {
name = "env.GF_SMTP_HOST"
value = "${var.smtp_host}:${var.smtp_port}"
}
set {
name = "env.GF_SMTP_USER"
value = var.smtp_user
}
set {
name = "env.GF_SMTP_PASSWORD"
value = var.smtp_password
}
set {
name = "env.GF_SMTP_FROM_ADDRESS"
value = "grafana@${var.domain}"
}
set {
name = "env.GF_DATABASE_TYPE"
value = "postgres"
}
set {
name = "env.GF_DATABASE_HOST"
value = "postgresql-primary.postgres"
}
set {
name = "env.GF_DATABASE_NAME"
value = "grafana"
}
set {
name = "env.GF_DATABASE_USER"
value = "grafana"
}
set {
name = "env.GF_DATABASE_PASSWORD"
value = var.grafana_db_password
}
}
resource "kubernetes_manifest" "grafana_ingress" {
manifest = {
apiVersion = "traefik.io/v1alpha1"
kind = "IngressRoute"
metadata = {
name = "grafana"
namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
}
spec = {
entryPoints = ["websecure"]
routes = [
{
match = "Host(`grafana.${var.domain}`)"
kind = "Rule"
services = [
{
name = "grafana"
port = "service"
}
]
}
]
}
}
}
{{< /highlight >}}
We enable both the datasource and dashboard sidecars by setting `sidecar.datasources.enabled` and `sidecar.dashboards.enabled`. These sidecars automatically inject all dashboards and data sources published as `ConfigMap`, like those provided by the Prometheus stack and Flux. `serviceMonitor.enabled` creates a `ServiceMonitor` so Prometheus scrapes Grafana's own metrics.
Grafana should deploy and run its database migrations successfully. Log in right after at `https://grafana.kube.rocks/login` with the admin account. You can get the password with `kg secret -n monitoring grafana -o jsonpath='{.data.admin-password}' | base64 -d`.
#### Native dashboards
If you go to `https://grafana.kube.rocks/dashboards`, you should see many dashboards already available and working perfectly, giving you a complete view of:

- Some core components of K8s, like CoreDNS, the kube API server, and all kubelets
- Details of pods, namespaces, and workloads
- Nodes, thanks to Node exporter
- Prometheus and Grafana's own stats
- Flux stats
*(Dashboard screenshots: Prometheus, Nodes, Cluster, Kube components, Flux.)*
#### Additional dashboards
You can easily import additional dashboards from the Grafana marketplace, or include them in a `ConfigMap` for automatic provisioning, as sketched below.
*(Additional dashboards: Traefik, cert-manager, Longhorn, PostgreSQL, Redis.)*
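As a sketch of the `ConfigMap` route, assuming the Grafana sidecar's default `grafana_dashboard` discovery label and a hypothetical dashboard JSON exported from the marketplace and stored next to the Terraform module:

```tf
resource "kubernetes_config_map_v1" "my_dashboard" {
  metadata {
    name      = "my-dashboard"
    namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
    labels = {
      grafana_dashboard = "1"
    }
  }

  data = {
    "my-dashboard.json" = file("${path.module}/dashboards/my-dashboard.json")
  }
}
```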
#### Other core components
Some other core components, like etcd, the scheduler, the proxy, and the controller manager, need their metrics endpoints enabled before they can be scraped. See the K3s docs or this issue.
From the Terraform Hcloud project, use `control_planes_custom_config` to expose all the remaining metrics endpoints:
{{< highlight host="demo-kube-hcloud" file="kube.tf" >}}
module "hcloud_kube" {
//...
control_planes_custom_config = {
etcd-expose-metrics = true,
kube-scheduler-arg = "bind-address=0.0.0.0",
kube-controller-manager-arg = "bind-address=0.0.0.0",
kube-proxy-arg = "metrics-bind-address=0.0.0.0",
}
//...
}
{{< /highlight >}}
{{< alert >}}
As the above config only applies at cluster initialization, you may instead edit `/etc/rancher/k3s/config.yaml` directly and restart the K3s server, as sketched below.
{{< /alert >}}
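A sketch of what the equivalent `/etc/rancher/k3s/config.yaml` entries would look like on each control plane, followed by a K3s restart (assuming the standard `k3s` systemd service):

```yaml
etcd-expose-metrics: true
kube-scheduler-arg:
  - bind-address=0.0.0.0
kube-controller-manager-arg:
  - bind-address=0.0.0.0
kube-proxy-arg:
  - metrics-bind-address=0.0.0.0
```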
## Logging
Last but not least, we need to add a logging stack. The most popular one is the Elastic Stack, but it's very resource intensive. A more lightweight option is Loki, also part of the Grafana Labs ecosystem.
In order to run in scalable mode, we need an S3 storage backend. We will reuse the same S3-compatible storage as for Longhorn backups here, but it's recommended to use a separate bucket and credentials.
### Loki
Let's install it now:
{{< highlight host="demo-kube-k3s" file="logging.tf" >}}
resource "kubernetes_namespace_v1" "logging" {
metadata {
name = "logging"
}
}
resource "helm_release" "loki" {
chart = "loki"
version = "5.15.0"
repository = "https://grafana.github.io/helm-charts"
name = "loki"
namespace = kubernetes_namespace_v1.logging.metadata[0].name
set {
name = "loki.auth_enabled"
value = "false"
}
set {
name = "loki.compactor.retention_enabled"
value = "true"
}
set {
name = "loki.limits_config.retention_period"
value = "24h"
}
set {
name = "loki.storage.bucketNames.chunks"
value = var.s3_bucket
}
set {
name = "loki.storage.bucketNames.ruler"
value = var.s3_bucket
}
set {
name = "loki.storage.bucketNames.admin"
value = var.s3_bucket
}
set {
name = "loki.storage.s3.endpoint"
value = var.s3_endpoint
}
set {
name = "loki.storage.s3.region"
value = var.s3_region
}
set {
name = "loki.storage.s3.accessKeyId"
value = var.s3_access_key
}
set {
name = "loki.storage.s3.secretAccessKey"
value = var.s3_secret_key
}
set {
name = "loki.commonConfig.replication_factor"
value = "1"
}
set {
name = "read.replicas"
value = "1"
}
set {
name = "backend.replicas"
value = "1"
}
set {
name = "write.replicas"
value = "2"
}
set {
name = "write.tolerations[0].key"
value = "node-role.kubernetes.io/storage"
}
set {
name = "write.tolerations[0].effect"
value = "NoSchedule"
}
set {
name = "write.nodeSelector.node-role\\.kubernetes\\.io/storage"
type = "string"
value = "true"
}
set {
name = "monitoring.dashboards.namespace"
value = kubernetes_namespace_v1.monitoring.metadata[0].name
}
set {
name = "monitoring.selfMonitoring.enabled"
value = "false"
}
set {
name = "monitoring.selfMonitoring.grafanaAgent.installOperator"
value = "false"
}
set {
name = "monitoring.lokiCanary.enabled"
value = "false"
}
set {
name = "test.enabled"
value = "false"
}
}
{{< /highlight >}}
Use `loki.limits_config.retention_period` to set a maximum retention period. You need to set `write.replicas` to at least 2, or you'll get a 500 API error, "too many unhealthy instances in the ring". As we force them to be deployed on storage nodes, be sure to have 2 storage nodes.
### Promtail
Okay, so Loki is running but not fed yet. For that, we'll deploy Promtail, a log collector that runs on each node, collects logs from all pods, and ships them to Loki.
{{< highlight host="demo-kube-k3s" file="logging.tf" >}}
resource "helm_release" "promtail" {
chart = "promtail"
version = "6.15.0"
repository = "https://grafana.github.io/helm-charts"
name = "promtail"
namespace = kubernetes_namespace_v1.logging.metadata[0].name
set {
name = "tolerations[0].effect"
value = "NoSchedule"
}
set {
name = "tolerations[0].operator"
value = "Exists"
}
set {
name = "serviceMonitor.enabled"
value = "true"
}
}
{{< /highlight >}}
Ha, finally a simple Helm chart! Seems too good to be true. We just have to add a generic `tolerations` entry so the Promtail `DaemonSet` is deployed on every node for proper log scraping.
### Loki data source
Because we are doing GitOps, we want all Loki dashboards and data sources configured automatically. It's already done for the dashboards, but we still need to add the data source.
Let's apply the next Terraform resource:
{{< highlight host="demo-kube-k3s" file="logging.tf" >}}
resource "kubernetes_config_map_v1" "loki_grafana_datasource" {
metadata {
name = "loki-grafana-datasource"
namespace = kubernetes_namespace_v1.monitoring.metadata[0].name
labels = {
grafana_datasource = "1"
}
}
data = {
"datasource.yaml" = <<EOF
apiVersion: 1
datasources:
- name: Loki
type: loki
uid: loki
url: http://loki-gateway.logging/
access: proxy
EOF
}
}
{{< /highlight >}}
Now go to `https://grafana.kube.rocks/connections/datasources/edit/loki` and ensure that Loki responds correctly by clicking on Test.
You can now admire your logs in the Loki UI at `https://grafana.kube.rocks/explore`!
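If you prefer checking from the terminal, a hedged alternative is to hit the Loki API through a port-forward (assuming the `loki-gateway` service listens on port 80, as the datasource URL suggests):

```sh
kubectl port-forward -n logging svc/loki-gateway 3100:80 &
# Should return the label names collected by Promtail
curl -s http://localhost:3100/loki/api/v1/labels
```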
### Loki dashboards
We have nothing more to do; all dashboards are already provided by the Loki Helm chart.
## 5th check ✅
We now have a full monitoring suite with a performant logging collector! That was a pretty massive subject. At this stage, you have a good starting point to run many apps on your cluster with high scalability and observability. We are done with the pure operational part. It's finally time to tackle the building part for a complete development stack. Go to the [next part]({{< ref "/posts/16-a-beautiful-gitops-day-6" >}}) to begin with continuous integration.