= Kubernetes App Auto-scaling
:toc:
:icons:
:linkcss:
:imagesdir: ../../resources/images

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/[Horizontal Pod Autoscaling] (HPA) is a Kubernetes feature that dynamically increases or decreases the number of pod replicas based on resource utilization metrics. As of Kubernetes version 1.9, the direction for HPA is to use the Metrics Server rather than https://github.com/kubernetes/heapster[Heapster]. HPA can automatically scale pods deployed in a replication controller, deployment, or replica set. For additional information on how HPA works, check out the Kubernetes https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/[community documentation].

== Prerequisites

In order to perform the exercises in this chapter, you'll need to deploy configurations to a Kubernetes cluster. To create an EKS-based Kubernetes cluster, use the link:../../01-path-basics/102-your-first-cluster#create-a-kubernetes-cluster-with-eks[AWS CLI] (recommended). If you wish to create a Kubernetes cluster without EKS, you can instead use link:../../01-path-basics/102-your-first-cluster#alternative-create-a-kubernetes-cluster-with-kops[kops].

Deploy the metrics server:

    $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/metrics-server/v1.8.x.yaml

== Deploy an application

In this step, we deploy a simple Go web application and give it a small CPU request (`50m`) for the purposes of this test. HPA measures CPU utilization as a percentage of this request.

    $ kubectl run webapp --image=trevorrobertsjr/webapp --requests=cpu=50m --expose --port=8080
    service "webapp" created
    deployment "webapp" created

The command also publishes the service on port 8080.

== Horizontal Pod Autoscaler configuration

Now that our application is running, we create a Horizontal Pod Autoscaler for our webapp deployment.

    $ kubectl autoscale deployment webapp --cpu-percent=10 --min=1 --max=10
    deployment "webapp" autoscaled

This command maintains between 1 and 10 replicas of the pod.
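As a rough illustration of how the `--cpu-percent`, `--min`, and `--max` values interact, the sketch below approximates the scaling rule described in the Kubernetes HPA documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up, then clamped to the configured bounds. The `desired_replicas` helper is hypothetical and exists only for this example. It does not model the controller's tolerance band or its limits on how quickly replicas are added or removed, so a live cluster may report different numbers.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): approximates the documented rule
#   desired = ceil(currentReplicas * currentUtilization / targetUtilization)
# and clamps the result to the range [min, max].
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_CPU_PCT TARGET_CPU_PCT MIN MAX
desired_replicas() {
    awk -v cur="$1" -v util="$2" -v target="$3" -v min="$4" -v max="$5" 'BEGIN {
        raw = cur * util / target
        desired = int(raw)
        if (desired < raw) desired++          # round up for non-negative values
        if (desired < min) desired = min      # never go below --min
        if (desired > max) desired = max      # never go above --max
        print desired
    }'
}

desired_replicas 1 62 10 1 10    # 1 pod at 62% against a 10% target
desired_replicas 4 112 10 1 10   # result exceeds --max, so it is clamped
desired_replicas 8 0 10 1 10     # idle pods fall back to --min
```

With a 10% target, even modest load asks for many more pods, which is why such a low `--cpu-percent` is convenient for a demonstration.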
The autoscaler increases or decreases the number of replicas to maintain an average CPU utilization of 10% across all the pods.

== Generate load

The simplest way to generate load is to access the application in an infinite loop, similar to the example in the Kubernetes Horizontal Pod Autoscaler documentation.

First, deploy a busybox container named `load-generator` and attach to its prompt:

    $ kubectl run -i --tty load-generator --image=busybox /bin/sh

At the `load-generator` command prompt, send continuous requests to the webapp:

    $ while true; do wget -q -O- http://webapp.default.svc.cluster.local:8080; done

If for any reason you get disconnected from the `load-generator` container, you can re-attach to it with the following command:

    $ kubectl attach $(kubectl get pod | grep load | awk '{print $1}') -c load-generator -i -t

In a different terminal window, check the status of the Horizontal Pod Autoscaler:

    $ kubectl get hpa -w

You will see output similar to the following over successive queries of the `hpa` resource:

    $ kubectl get hpa -w
    NAME      REFERENCE           TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
    webapp    Deployment/webapp   0% / 10%     1         10        1          6m
    webapp    Deployment/webapp   62% / 10%    1         10        1          7m
    webapp    Deployment/webapp   62% / 10%    1         10        4          7m
    webapp    Deployment/webapp   112% / 10%   1         10        4          8m
    webapp    Deployment/webapp   112% / 10%   1         10        4          8m
    webapp    Deployment/webapp   53% / 10%    1         10        4          9m

Notice that, eventually, the value in the `REPLICAS` column increases as the load generator continues to run.

== Stop load

In the terminal window that is running the load generator, press `Ctrl`+`C` to terminate the process.

Again, run the `kubectl get hpa -w` command in your other terminal window, and you will see the number of replicas begin to decrease as the CPU load returns to 0%.

NOTE: It takes a few minutes for the number of replicas to scale down.

The output looks similar to the following:

```
$ kubectl get hpa -w
NAME      REFERENCE           TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
webapp    Deployment/webapp   51% / 10%    1         10        4          10m
webapp    Deployment/webapp   51% / 10%    1         10        4          10m
webapp    Deployment/webapp   27% / 10%    1         10        4          11m
webapp    Deployment/webapp   27% / 10%    1         10        8          11m
webapp    Deployment/webapp   0% / 10%     1         10        8          12m
webapp    Deployment/webapp   0% / 10%     1         10        8          12m
webapp    Deployment/webapp   0% / 10%     1         10        8          13m
webapp    Deployment/webapp   0% / 10%     1         10        8          13m
webapp    Deployment/webapp   0% / 10%     1         10        8          14m
webapp    Deployment/webapp   0% / 10%     1         10        8          14m
webapp    Deployment/webapp   0% / 10%     1         10        8          15m
webapp    Deployment/webapp   0% / 10%     1         10        8          15m
webapp    Deployment/webapp   0% / 10%     1         10        8          16m
webapp    Deployment/webapp   0% / 10%     1         10        1          16m
webapp    Deployment/webapp   0% / 10%     1         10        1          17m
```

== Cleanup

    $ kubectl delete hpa/webapp deploy/load-generator deploy/webapp

You are now ready to continue on with the workshop!

:frame: none
:grid: none
:valign: top

[align="center", cols="2", grid="none", frame="none"]
|=====
|image:button-continue-standard.png[link=../../05-path-next-steps/502-for-further-reading]
|image:button-continue-developer.png[link=../../03-path-application-development/305-app-tracing-with-jaeger-and-x-ray]
|link:../../standard-path.adoc[Go to Standard Index]
|link:../../developer-path.adoc[Go to Developer Index]
|=====