Automatic Node Provisioning

This workshop has been deprecated and archived. The new Amazon EKS Workshop is now available at www.eksworkshop.com.

With Karpenter now active, we can begin to explore how Karpenter provisions nodes. In this section we are going to create some pods using a deployment and watch Karpenter provision nodes in response.

In this part of the workshop we will use a Deployment with the pause image. If you are not familiar with pause pods, you can read more about them here.

Run the following command and try to answer the questions below:

cat <<EOF > inflate.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        intent: apps
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
              memory: 1.5Gi
EOF
kubectl apply -f inflate.yaml
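Before answering the questions, it can help to confirm what was just created. A quick sketch (assuming the deployment landed in the current namespace):

```shell
# The deployment exists, but with replicas: 0 no pods are created yet
kubectl get deployment inflate
kubectl get pods -l app=inflate
```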

Challenge

You can use Kube-ops-view or just the plain kubectl CLI to visualize the changes and answer the questions below. In the answers we will provide the CLI commands that will help you check the responses. Remember: to get the URL of kube-ops-view you can run the following command: kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'

Answer the following questions. You can expand each question to get a detailed answer and validate your understanding.

1) Why did Karpenter not scale the cluster after making the initial deployment?
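A hint you can check yourself: the deployment was created with replicas: 0, so there are no pending pods for Karpenter to react to.

```shell
# No pods match the selector, so nothing is Pending and Karpenter
# has no unschedulable pods to provision capacity for
kubectl get pods -l app=inflate
```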

2) How would you scale the deployment to 1 replica?
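One way to do this from the CLI (a sketch; you could equally edit the manifest and re-apply it):

```shell
# Scale to one replica; the pod goes Pending until Karpenter
# provisions a node that satisfies its requests and nodeSelector
kubectl scale deployment inflate --replicas 1
kubectl get pods -l app=inflate -w
```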

3) Which instance type did Karpenter use when increasing the instances? Why that instance type?
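You can check this in the Karpenter controller logs and on the node itself. The namespace and label selector below assume the workshop's default Karpenter installation; adjust them if yours differs:

```shell
# The controller logs record which instance type it decided to launch
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter | grep -i launch
# The instance type is also exposed as a well-known node label
kubectl get nodes -L node.kubernetes.io/instance-type
```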

4) What are the new instance's properties and labels?
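A sketch for inspecting the new node (replace the node-name placeholder with the actual node from the previous step):

```shell
# List all node labels, including those Karpenter applied
kubectl get nodes --show-labels
# Describe one node for its capacity, allocatable resources and labels
kubectl describe node <node-name>
```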

5) Why was the newly created inflate pod not scheduled onto the managed node group?
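Recall that the deployment carries a nodeSelector of intent: apps. Assuming the managed node group's nodes do not carry that label (as in this workshop's setup), you can verify it like this:

```shell
# Show which nodes carry the intent label the pod's nodeSelector requires
kubectl get nodes -L intent
```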

6) How would you scale the number of replicas to 10? What do you expect to happen? Which instance types were selected in this case?
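The same scaling approach as before, this time with enough replicas that the aggregate requests exceed a single small node:

```shell
# Ten replicas each requesting 1 vCPU / 1.5Gi; watch Karpenter respond
kubectl scale deployment inflate --replicas 10
kubectl get pods -l app=inflate -w
# Check which instance types were chosen for the extra capacity
kubectl get nodes -L node.kubernetes.io/instance-type
```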

7) How would you scale the number of replicas to 0? What do you expect to happen?
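A sketch for scaling back down and watching the nodes drain away:

```shell
kubectl scale deployment inflate --replicas 0
# Once the nodes Karpenter provisioned are empty and ttlSecondsAfterEmpty
# elapses, Karpenter cordons, drains and terminates them
kubectl get nodes -w
```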

What have we learned in this section?

In this section we have learned:

  • Karpenter scales up nodes in a group-less approach. Karpenter selects which nodes to provision based on the pending pods and the Provisioner configuration: it determines what the best instances for the workload look like, and then provisions those instances. This is unlike Cluster Autoscaler, which first evaluates all existing node groups to find which one is best placed to scale, given the pod constraints.

  • Karpenter uses cordon and drain best practices to terminate nodes. When an empty node is terminated can be controlled with ttlSecondsAfterEmpty.

  • Karpenter can scale out from zero when applications have pending pods and scale in to zero when there are no running jobs or pods.

  • Provisioners can be set up to define governance rules that control how nodes are provisioned within a cluster partition. We can set requirements such as karpenter.sh/capacity-type to allow on-demand and Spot instances, or use karpenter.k8s.aws/instance-size to filter out smaller sizes. The full list of supported labels is available here.
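To make the last point concrete, here is a minimal Provisioner sketch using the v1alpha5 API that this workshop's ttlSecondsAfterEmpty setting implies; the exact values are illustrative, not the workshop's actual configuration:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # Allow both on-demand and Spot capacity
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand", "spot"]
    # Filter out the smallest instance sizes
    - key: karpenter.k8s.aws/instance-size
      operator: NotIn
      values: ["nano", "micro", "small"]
  # Terminate empty nodes after 30 seconds
  ttlSecondsAfterEmpty: 30
```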