Kubernetes: Horizontal Pod Autoscaling (HPA)

Manual Scaling

Whenever you see a spike in the QPS (queries per second) of your application, you have two options: horizontal scaling (running more pods) or vertical scaling (giving each pod more resources).

Horizontal Pod Auto-scaler (HPA)

HPA is a Kubernetes component that automatically scales the number of pods. The K8s controller responsible for auto-scaling is known as the Horizontal Controller. It works in three steps:

  • Fetch the desired metrics from the pods.
  • Compute the target number of replicas by comparing the fetched metric values to the target metric value.
  • Update the replica count in the scalable resource, e.g. a Deployment.

Pod Metrics

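The kubelet has a component known as cAdvisor, which collects metrics from the pods. Heapster aggregates these metrics, and the HPA fetches them from Heapster via its REST API. Note that Heapster has since been retired; in current Kubernetes versions the Metrics Server fills this role.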
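The per-pod metrics that feed the autoscaler can also be inspected by hand with `kubectl top pods`, provided a metrics pipeline is installed in the cluster. The sketch below is only an illustration: the sample output, the pod names, and the helper `sum_cpu_millicores` are hypothetical, and it assumes CPU is reported in millicores (values ending in `m`).

```shell
# Hypothetical sample in the column layout printed by `kubectl top pods`.
# Pod names and values are made up for illustration.
sample_top_output() {
  cat <<'EOF'
NAME         CPU(cores)   MEMORY(bytes)
test-app-1   60m          120Mi
test-app-2   90m          130Mi
test-app-3   50m          110Mi
EOF
}

# Read `kubectl top pods` style output on stdin and print the total CPU
# usage in millicores followed by the number of pods.
# Assumes every CPU value is given in millicores (e.g. "60m").
sum_cpu_millicores() {
  awk 'NR > 1 { sub(/m$/, "", $2); sum += $2; n++ }
       END    { printf "%d %d\n", sum, n }'
}

sample_top_output | sum_cpu_millicores
```

On a live cluster the same helper could be fed real data with `kubectl top pods | sum_cpu_millicores`.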

Compute targeted pod count

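The target pod count is the sum of the current metric values across all pods divided by the target value per pod, rounded up. For example, with three pods reporting values of 60, 90, and 50 against a target of 50:

Number of pods = ceil((60 + 90 + 50) / 50) = 4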
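The calculation above can be sketched as a small shell function. The name `desired_replicas` is hypothetical, and the sketch assumes integer metric values; it is not the controller's actual code. Dividing the sum by the target is equivalent to the formula in the Kubernetes documentation, ceil(currentReplicas × currentMetricValue / desiredMetricValue), when currentMetricValue is the average across pods.

```shell
# desired_replicas TARGET VALUE...
# Print ceil(sum of VALUEs / TARGET) using integer arithmetic.
desired_replicas() {
  dr_target=$1
  shift
  dr_sum=0
  for dr_value in "$@"; do
    dr_sum=$((dr_sum + dr_value))
  done
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (dr_sum + dr_target - 1) / dr_target ))
}

# Three pods at 60, 90 and 50 against a target of 50
desired_replicas 50 60 90 50   # prints 4
```

Rounding up rather than down is the conservative choice here: a fractional result means the current pods are above target, so an extra pod is needed.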

Scalable Resource Update

The required number of pods has now been computed. To turn that number into actual running pods, the HPA updates the replica count in the configuration of the scalable resource (e.g. Deployment, ReplicaSet).
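Done by hand, this update is roughly what `kubectl scale` does. The sketch below only prints the command instead of running it, and it keeps the computed count within the minimum and maximum bounds that an HPA is configured with (the `--min` and `--max` options of `kubectl autoscale`). The helper names `clamp_replicas` and `scale_command` are hypothetical.

```shell
# clamp_replicas DESIRED MIN MAX
# Print DESIRED limited to the range [MIN, MAX].
clamp_replicas() {
  if [ "$1" -lt "$2" ]; then
    echo "$2"
  elif [ "$1" -gt "$3" ]; then
    echo "$3"
  else
    echo "$1"
  fi
}

# scale_command DEPLOYMENT DESIRED MIN MAX
# Print (do not run) the kubectl command that sets the clamped replica count.
scale_command() {
  echo "kubectl scale deployment $1 --replicas=$(clamp_replicas "$2" "$3" "$4")"
}

scale_command testAppDeployment 4 1 5   # within bounds, stays at 4
scale_command testAppDeployment 8 1 5   # above the maximum, capped at 5
```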

Auto-Scaling Process

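Combining the steps above, the overall process is as follows: the HPA fetches the metrics from the pods, computes the target number of replicas, and updates the replica count of the scalable resource.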

Kubectl commands

Create HPA resources on a deployment

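# Create an HPA for the deployment (target 30% CPU, between 1 and 5 replicas)
kubectl autoscale deployment testAppDeployment --name=testAppHpa --cpu-percent=30 --min=1 --max=5
# List the HPAs in all namespaces
kubectl get hpa -A
# Show the status and events of the HPA
kubectl describe hpa testAppHpa
# Print the full HPA definition as YAML
kubectl get hpa testAppHpa -o yaml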
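The same autoscaler can also be declared as a manifest instead of being created imperatively. The sketch below prints an `autoscaling/v2` HorizontalPodAutoscaler to stdout. The function `hpa_manifest` and the names `test-app-hpa` and `test-app-deployment` are hypothetical; lowercase names are used because Kubernetes object names must be lowercase.

```shell
# hpa_manifest NAME DEPLOYMENT CPU_PERCENT MIN MAX
# Print an autoscaling/v2 HorizontalPodAutoscaler manifest to stdout.
hpa_manifest() {
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: $1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $2
  minReplicas: $4
  maxReplicas: $5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: $3
EOF
}

# Hypothetical names; same targets as the kubectl autoscale example
hpa_manifest test-app-hpa test-app-deployment 30 1 5
```

Against a real cluster the output could be applied with `hpa_manifest test-app-hpa test-app-deployment 30 1 5 | kubectl apply -f -`.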



Ajay Yadav

Believer of Distributed Systems