We will deploy an application and expose it as a service on TCP port 80.
The application is a custom-built image based on the php-apache image. Its index.php page performs calculations to generate CPU load.
kubectl create deployment php-apache --image=us.gcr.io/k8s-artifacts-prod/hpa-example
kubectl set resources deploy php-apache --requests=cpu=200m
kubectl expose deploy php-apache --port 80
kubectl get pod -l app=php-apache
This HPA scales up when average CPU utilization across the pods exceeds 50% of the CPU each container requests. With the 200m request set above, that is about 100m per pod.
kubectl autoscale deployment php-apache `#The target average CPU utilization` \
    --cpu-percent=50 \
    --min=1 `#The lower limit for the number of pods that can be set by the autoscaler` \
    --max=10 `#The upper limit for the number of pods that can be set by the autoscaler`
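The same autoscaler can also be described declaratively. The sketch below writes a manifest that should be roughly equivalent to the kubectl autoscale command, assuming the cluster serves the autoscaling/v2 API (Kubernetes 1.23 or later). The file name php-apache-hpa.yaml is a hypothetical choice, not part of the workshop. Use it instead of the imperative command, not in addition to it.

```shell
# Write an HPA manifest mirroring: --cpu-percent=50 --min=1 --max=10
# (php-apache-hpa.yaml is an illustrative file name.)
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

# To create the HPA from the file, run:
#   kubectl apply -f php-apache-hpa.yaml
```

Keeping the HPA in a file makes it easier to review and version alongside the deployment, at the cost of one extra step compared with the one-line command.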
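View the HPA using kubectl. The TARGETS column will probably show <unknown>/50% for 1-2 minutes while metrics are collected. After that it should show the current CPU utilization in place of <unknown>, for example 0%/50% while the application is idle.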
kubectl get hpa
Open a new terminal in the Cloud9 Environment and run the following command to drop into a shell on a new container
kubectl run -i --tty load-generator --image=busybox /bin/sh
Execute a while loop that keeps requesting http://php-apache
while true; do wget -q -O - http://php-apache; done
In the previous tab, watch the HPA with the following command
kubectl get hpa -w
You will see the HPA scale the deployment up from 1 pod, to at most our configured maximum of 10, until the average CPU utilization falls below our 50% target.
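The replica counts you observe follow the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), bounded by the configured minimum and maximum. The function below is a minimal sketch of that arithmetic only. The name desired_replicas is hypothetical, and the sketch ignores the controller's tolerance band, pod readiness handling, and stabilization windows, so real behavior may differ slightly.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PERCENT TARGET_UTIL_PERCENT MIN MAX
# Illustrative helper, not part of kubectl.
desired_replicas() {
  hpa_current=$1
  hpa_util=$2
  hpa_target=$3
  hpa_min=$4
  hpa_max=$5

  # Integer ceiling of (current * utilization / target).
  hpa_desired=$(( (hpa_current * hpa_util + hpa_target - 1) / hpa_target ))

  # Clamp to the --min / --max bounds given to the autoscaler.
  if [ "$hpa_desired" -lt "$hpa_min" ]; then hpa_desired=$hpa_min; fi
  if [ "$hpa_desired" -gt "$hpa_max" ]; then hpa_desired=$hpa_max; fi

  echo "$hpa_desired"
}

desired_replicas 1 250 50 1 10   # one pod at 250% of its request -> 5
desired_replicas 5 90 50 1 10    # five pods averaging 90%        -> 9
desired_replicas 4 400 50 1 10   # would be 32, capped by --max   -> 10
```

This is why the pod count tends to climb in steps rather than one pod at a time: each evaluation recomputes the target from the current average.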
You can now stop the load test running in the other terminal by pressing Ctrl + C. The HPA will then slowly bring the replica count back down to the configured minimum; by default it waits about five minutes before scaling down, so this does not happen immediately. To leave the load-generator container, press Ctrl + D.