About the Kubernetes Cluster Autoscaler configuration and saving cost

January 29, 2024

Hello there,

If you are using Kubernetes, you are most likely using a tool to automatically scale your cluster nodes up and down based on the number of pods running on your cluster and the resource usage (CPU/Memory). Two common solutions are https://github.com/kubernetes/autoscaler and https://karpenter.sh/. Well, let me tell you something about the former.

I suppose you installed Cluster Autoscaler using the provided Helm chart. Unless you carefully check the default CLI flags passed to the container, you might miss two of them: skip-nodes-with-local-storage and skip-nodes-with-system-pods, both set to true by default (link to helm chart values).

These flags are easy to overlook, as most examples already set them to false, like this AWS example. But what do they do, you may ask?

skip-nodes-with-local-storage tells the autoscaler not to scale down nodes running pods with local storage, e.g. EmptyDir or HostPath. This means that if the autoscaler finds a node running a single pod with an EmptyDir volume, it will not remove that node even if it is under-utilized.

EmptyDir is often used as a local storage cache that can be discarded at any time without issues. If your container needs to stay on the same node and is not stateless, then there is a bigger issue than the CA configuration 🤡. Always use stateful components with stateful APIs: PersistentVolume, StatefulSet, etc, or remote storage.

For the flag skip-nodes-with-system-pods, the documentation is clear enough:

"If true cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods)."

So let's say you deploy the Cluster Autoscaler pod in kube-system, along with the Vertical Pod Autoscaler pods (admission controller, recommender and updater), and they all end up on different nodes: by default, this blocks 4 nodes from downscaling 🥲
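To get a rough idea of how many nodes are pinned this way, here is a small sketch. The blocked_nodes helper is hypothetical (my own name, not part of any tool). It only approximates the rule quoted above by looking at who owns each kube-system pod, and it ignores everything else the autoscaler takes into account (PodDisruptionBudgets, for instance), so treat the result as a hint rather than the autoscaler's actual decision.

```shell
# Hypothetical helper: reads "NODE POD OWNER_KIND" lines on stdin and prints
# every node hosting at least one kube-system pod that is neither owned by a
# DaemonSet nor a mirror pod (mirror pods are owned by the Node itself).
blocked_nodes() {
  # Pods that are not scheduled yet show "<none>" as node name, skip them.
  awk '$1 != "<none>" && $3 != "DaemonSet" && $3 != "Node" { print $1 }' | sort -u
}

# Usage against a live cluster:
# kubectl get pods -n kube-system --no-headers \
#   -o custom-columns='NODE:.spec.nodeName,POD:.metadata.name,OWNER:.metadata.ownerReferences[0].kind' \
#   | blocked_nodes
```

Each node printed is a candidate that skip-nodes-with-system-pods would keep alive.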

Anyway, to check if your autoscaler uses the flags, you can run the following shell commands:

CA_NAME=cluster-autoscaler-aws-cluster-autoscaler
CA_NAMESPACE=kube-system

kubectl get deployment "$CA_NAME" -n "$CA_NAMESPACE" -o=jsonpath='{.spec.template.spec.containers[0].command}' | jq .
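If you would rather get a direct answer than read through the whole command, the following sketch may help. Both function names are hypothetical. flag_value assumes the flags are rendered as --name=value, which is how the Helm chart passes its extraArgs, and it falls back to true when a flag is absent, since that is the default mentioned earlier. disable_skip_flags assumes you added the chart repository under the alias autoscaler and that switching both flags off is what you want; without a --version, Helm will also move you to the latest chart version.

```shell
# Hypothetical helper: prints the effective value of one flag.
# $1: flag name without the leading dashes.
# stdin: the container command as printed by the kubectl command above.
flag_value() {
  value=$(grep -o -e "--$1=[a-z]*" | head -n 1 | cut -d= -f2 || true)
  # A flag that is not passed explicitly keeps its default, which is true.
  echo "${value:-true}"
}

# Hypothetical helper: re-deploys the chart with both skip flags turned off.
# $1: Helm release name, $2: namespace (both depend on your installation).
disable_skip_flags() {
  helm upgrade "$1" autoscaler/cluster-autoscaler \
    --namespace "$2" \
    --reuse-values \
    --set extraArgs.skip-nodes-with-local-storage=false \
    --set extraArgs.skip-nodes-with-system-pods=false
}

# Usage:
# CMD=$(kubectl get deployment "$CA_NAME" -n "$CA_NAMESPACE" \
#   -o=jsonpath='{.spec.template.spec.containers[0].command}')
# echo "$CMD" | flag_value skip-nodes-with-local-storage
# echo "$CMD" | flag_value skip-nodes-with-system-pods
```

If either call prints true, the corresponding flag is still active on your cluster.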

Hope it was helpful :)