Run Kubernetes Clusters for Less with Amazon EC2 Spot and Karpenter

Learn how to run Kubernetes clusters for up to 90% off with Amazon Elastic Kubernetes Service (EKS), Amazon EC2 Spot Instances, and Karpenter - all in less than 60 minutes.

Christian Melendez
Amazon Employee
Published Sep 7, 2023
Last Modified Apr 26, 2024
One of the main cost factors for Kubernetes clusters is the compute layer of the data plane. Running Kubernetes clusters on Amazon EC2 Spot Instances is a great way to reduce your compute costs significantly. When using Spot Instances, you can get up to a 90% price discount compared to On-Demand Instances.
Spot is a great match for stateless, fault-tolerant, and flexible workloads such as big data, containerized workloads, CI/CD, web servers, high-performance computing (HPC), and test & development workloads. Containers often have these characteristics, which makes them Spot-friendly. For non-Spot-friendly workloads, like stateful applications within your cluster, you can continue using On-Demand Instances.
To optimize data plane capacity further, you can add nodes when pods are unschedulable due to insufficient capacity, or remove nodes when they’re no longer needed. To adjust nodes automatically, use either Cluster Autoscaler (CA) or Karpenter. Both tools support Spot, and in this tutorial I’ll focus on Karpenter.
I’ll guide you through the steps you need to follow to configure an EKS cluster with Spot Instances and Karpenter. Additionally, I’ll show you how to configure a workload to see Karpenter in action by provisioning the required capacity using Spot Instances.

Karpenter is an open-source node provisioning project built for Kubernetes. As new pods continue coming to your cluster, either because you increased the number of replicas manually, through a Horizontal Pod Autoscaler (HPA) policy, or through a Kubernetes Event-driven Autoscaling (KEDA) event, at some point your data plane nodes will be at full capacity, causing you to have pending (unschedulable) pods. The Karpenter controller reacts to this problem and aggregates the capacity needs of these pending pods by evaluating scheduling constraints (resource requests, node selectors, affinities, tolerations, and topology spread constraints). Then, Karpenter provisions the right nodes to meet the requirements of these pending pods.
One of the main advantages of using Karpenter is the simplicity of configuring Spot best practices like instance type diversification (multiple families, sizes, generations, etc.) in what Karpenter calls a NodePool. If you’re getting started with Spot in Amazon Elastic Kubernetes Service (EKS) or are struggling with the complexity of configuring multiple node groups, I recommend using Karpenter. However, if you’re already using CA and want to start spending less, you can find the detailed configuration to use Spot with CA here.
✅ AWS experience: Advanced - 300
⏱ Time to complete: 75 minutes
💰 Cost to complete: < $10.00 USD
🧩 Prerequisites:
- AWS Account
- AWS CLI
- Kubernetes CLI (kubectl)
- Terraform CLI
- Helm
📢 Feedback: Any feedback, issues, or just a 👍 / 👎 ?
💾 Code: Download the code
🛠 Contributors: @jakeskyaws
⏰ Last Updated: 2024-4-26

Step 1: Create a Cloud9 Environment

💡 Tip: You can skip this step if you already have a Cloud9 environment or if you’re planning to run all steps on your own computer. Just make sure you have the proper permissions listed in the prerequisites section of this tutorial.
💡 Tip: You can control which region the Cloud9 environment launches in by setting the AWS_REGION environment variable.
I’ve prepared an AWS CloudFormation template to create a Cloud9 environment. It has all the tools you need to follow this tutorial, such as kubectl and the Terraform CLI. You can either create the CloudFormation stack through the AWS Console, or do it through the command line. You can download the CloudFormation template here. I'm going to give you all the commands you need to run to create the stack using the CLI.
💡 IMPORTANT: You need to use the same IAM user/role both in the AWS Console and the AWS CLI setup. Otherwise, when you try to open the Cloud9 environment, you won't have permission to do so.
Before you create the CloudFormation stack, you need to get a public subnet ID to launch the Cloud9 instance with Internet access. Once you get it, set the following environment variable:
Now, let’s create the Cloud9 environment running the following command:
Wait 3-5 minutes after CloudFormation finishes, then open the Cloud9 console and open the environment. From now on, you’ll be running all commands in this tutorial in the Cloud9 terminal.
Cloud9 normally manages IAM credentials dynamically. This isn’t currently compatible with the EKS IAM authentication, so you need to disable it and rely on the IAM role instead. To do so, run the following commands in the Cloud9 terminal:
To confirm that you have all the CLI tools needed for this tutorial, input these commands:
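A minimal sketch of such a check, assuming the four CLI tools named in the prerequisites are the ones to verify. The function name `check_cli_tools` is hypothetical and is not part of the tutorial's code.

```shell
# Hypothetical helper: report which of the given CLI tools are missing from PATH.
# Returns 0 when every tool is found and 1 when at least one is missing.
check_cli_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_cli_tools aws kubectl terraform helm` lists each tool as found or missing. Once they are all present, `aws --version`, `kubectl version --client`, `terraform version`, and `helm version` print the installed versions.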
💡 NOTE: If the CloudFormation stack has not reached the "CREATE_COMPLETE" status, the CLI tools may not have been installed yet. Please wait until the stack completes before proceeding with any CLI commands.

Step 2: Create an EKS Cluster with Karpenter Using EKS Blueprints for Terraform

💡 Tip: The Terraform template used in this tutorial is using an On-Demand managed node group to host the Karpenter controller. However, if you have an existing cluster, you can use an existing node group with On-Demand instances to deploy the Karpenter controller. To do so, you need to follow the Karpenter getting started guide.
In this step you'll create an Amazon EKS cluster using the EKS Blueprints for Terraform project. The Terraform template you’ll use in this tutorial is going to create a VPC, an EKS control plane, and a Kubernetes service account along with an IAM role, and associate them using IAM Roles for Service Accounts (IRSA) to let Karpenter launch instances. Additionally, the template adds the Karpenter node role to the aws-auth ConfigMap to allow nodes to connect, and creates an On-Demand managed node group for the kube-system and karpenter namespaces.
To create the cluster, run the following commands:
Once complete (after waiting about 15 minutes), run the following command to update the kubeconfig file to interact with the cluster through kubectl:
💡 Tip: If you’re using a different region or changed the name of the cluster, you can get the previous command for your setup from the Terraform output by running this command: terraform output -raw configure_kubectl.
You need to make sure you can interact with the cluster and that the Karpenter pods are running:
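A minimal sketch of this check, assuming kubectl already points at the new cluster and that the Karpenter controller runs in the `karpenter` namespace mentioned above. The function name `check_cluster` is hypothetical.

```shell
# Hypothetical helper: list the cluster nodes, then the Karpenter controller pods.
# The pods are only listed if the node listing succeeds, so a connectivity
# problem surfaces on the first command.
check_cluster() {
  kubectl get nodes &&
    kubectl get pods --namespace karpenter
}
```

Running `check_cluster` should show the nodes of the managed node group and the Karpenter pods. If the first command fails, the kubeconfig update from the previous step is the likely place to look.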

Step 3: Set Up a Karpenter NodePool

The EKS cluster already has a static managed node group configured in advance for the kube-system and karpenter namespaces, and it’s going to be the only one you’ll need. For the rest of the pods, Karpenter will launch nodes through a NodePool CRD. The NodePool sets constraints on the nodes that can be created by Karpenter and the pods that can run on those nodes. A single Karpenter NodePool is capable of handling many different pod shapes, and for this tutorial you’ll only create the default NodePool.
💡 Tip: Karpenter simplifies the data plane capacity management using an approach called group-less auto scaling. This is because Karpenter is no longer using node groups, which match with Auto Scaling groups, to launch nodes. Over time, clusters using the paradigm of running different types of applications (that require different capacity types), end up with a complex configuration and operational model where node groups must be defined and provided in advance.
Before you continue, you need to enable your AWS account to launch Spot Instances if you haven't launched any yet. To do so, create the service-linked role for Spot by running the following command:
If the role has already been successfully created, you will see:
You don't need to worry about this error. You only had to run the command above to make sure you have the service-linked role to launch Spot Instances.
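A minimal sketch of this step, assuming the AWS CLI is configured with permission to create IAM service-linked roles. The wrapper function name is hypothetical, and matching on the text "has been taken" assumes that is how the already-exists error is worded in your CLI version.

```shell
# Hypothetical wrapper: create the Spot service-linked role and treat the
# "role already exists" error as success.
ensure_spot_service_linked_role() {
  # Capture stdout and stderr together so the error text can be inspected.
  if slr_output=$(aws iam create-service-linked-role \
      --aws-service-name spot.amazonaws.com 2>&1); then
    echo "Service-linked role for Spot created."
  elif printf '%s' "$slr_output" | grep -q "has been taken"; then
    echo "Service-linked role for Spot already exists; nothing to do."
  else
    # Any other failure (for example missing permissions) is reported as-is.
    printf '%s\n' "$slr_output" >&2
    return 1
  fi
}
```

Calling `ensure_spot_service_linked_role` is safe to repeat: the second and later runs only report that the role is already there.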
Now, you need to create two environment variables that we’ll use next. The values you need can be obtained from the Terraform output variables. Make sure you’re in the same folder where the Terraform main.tf file lives and run the following command:
💡 NOTE: If you're working with an existing EKS cluster, make sure to set the proper values for the previous environment variables, as we'll use those values to set up the Karpenter NodePool.
Let’s create a default NodePool by running the following commands:
Karpenter is now active and ready to begin provisioning nodes.
Let me highlight a few important settings from the default NodePool you just created:
  • requirements: Here’s where you define the type of nodes Karpenter can launch. Be as flexible as possible and let Karpenter choose the right instance type based on the pod requirements. For this NodePool, you’re saying Karpenter can launch either Spot or On-Demand Instances, families including c, m and r, with a minimum of 4 vCPUs and 8 GiB of memory. With this configuration, you’re choosing around 150 instance types from the 700+ available today in AWS. Read the next section to understand why this is important.
  • limits: This is how you constrain the maximum amount of resources that the NodePool will manage. Karpenter can launch instances with different specs, so instead of limiting a max number of instances (as you’d typically do in an Auto Scaling group), you define a maximum of vCPUs or memory to limit the number of nodes to launch. Karpenter provides a metric to monitor the percentage usage of this NodePool based on the limits you configure.
  • disruption: Karpenter does a great job at launching only the nodes you need, but as pods can come and go, at some point in time the cluster capacity can end up in a fragmented state. To avoid fragmentation and optimize the compute nodes in your cluster, you can enable consolidation. When enabled, Karpenter automatically discovers disruptable nodes and spins up replacements when needed.
  • expireAfter: Here’s where you define when a node will be deleted. This is useful to force new nodes with up-to-date AMIs. In this example we have set the value to 7 days.
  • nodeClassRef: This is where you reference the template to launch a node. An EC2NodeClass is where you define which subnets, security groups, and IAM role the nodes will use. You can set node tags or even configure user data. To learn more about which other configurations are available, go here.
You can also learn more about which other configuration properties are available for a NodePool here.
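The settings described above can be sketched as a manifest. This is an illustration and not the tutorial's exact NodePool: the `karpenter.sh/v1beta1` API version, the file name, the CPU limit, and the existence of an EC2NodeClass named `default` are all assumptions. Field names and placement differ between Karpenter API versions, so check the reference for the version you run.

```shell
# Write a sketch of a NodePool manifest to a local file (file name is arbitrary).
cat > nodepool-sketch.yaml <<'EOF'
apiVersion: karpenter.sh/v1beta1   # assumption: adjust to your Karpenter version
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        # Allow both purchase options; pods can narrow this with a nodeSelector.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Instance families c, m and r.
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        # At least 4 vCPUs.
        - key: karpenter.k8s.aws/instance-cpu
          operator: Gt
          values: ["3"]
        # At least 8 GiB of memory (this label is expressed in MiB).
        - key: karpenter.k8s.aws/instance-memory
          operator: Gt
          values: ["8191"]
      nodeClassRef:
        name: default              # assumes an EC2NodeClass named "default" exists
  limits:
    cpu: "1000"                    # illustrative cap on total vCPUs for this NodePool
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 168h              # 7 days
EOF
```

A file like this would be applied with `kubectl apply -f nodepool-sketch.yaml`, after the referenced EC2NodeClass has been created.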

Why Is It a Good Practice To Configure a Diverse Set of Instance Types?

As you noticed, with the above NodePool we’re basically letting Karpenter choose from a diverse set of instance types to launch the best instance type possible. If it’s an On-Demand Instance, Karpenter uses the lowest-price allocation strategy to launch the cheapest instance type that has available capacity. When you use multiple instance types, you can avoid the InsufficientInstanceCapacity error.
If it’s a Spot Instance, Karpenter uses the price-capacity-optimized (PCO) allocation strategy. PCO looks at both price and capacity availability to launch from the Spot Instance pools that are the least likely to be interrupted and have the lowest possible price. For Spot Instances, applying diversification is key. Spot Instances are spare capacity that can be reclaimed by EC2 when it is required. Karpenter allows you to diversify extensively to replace reclaimed Spot Instances automatically with instances from other pools where capacity is available.

Step 4: Deploy a Spot-friendly Workload

You’re now going to see Karpenter in action. Your default NodePool can launch both On-Demand and Spot Instances, but Karpenter considers the constraints you configure within a pod to launch the right node(s). Let’s create a Deployment with a nodeSelector to run the pods on Spot Instances. To do so, run the following command:
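A minimal sketch of such a Deployment, assuming a placeholder pause image and illustrative resource requests. The file name, the `app: stateless` label, and the request sizes are assumptions; only the `karpenter.sh/capacity-type: spot` nodeSelector is the point being illustrated.

```shell
# Write a sketch of a Spot-only Deployment to a local file.
cat > workload-sketch.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless
spec:
  replicas: 10
  selector:
    matchLabels:
      app: stateless
  template:
    metadata:
      labels:
        app: stateless
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: spot   # only schedule onto Spot nodes
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7  # placeholder image
          resources:
            requests:
              cpu: 512m        # illustrative; Karpenter sizes nodes from the requests
              memory: 512Mi
EOF
```

Applying it with `kubectl apply -f workload-sketch.yaml` and then running `kubectl get pods` would show the pods waiting for capacity until Karpenter's node is ready.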
As there are no nodes that match the pod’s requirements, all pods will be Pending, making Karpenter react and launch the nodes, similar to this output:
Review Karpenter logs to see what’s happening while you wait for the new node to be ready. Create the following alias:
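One possible form of that alias is sketched here. The alias name `kl` is an assumption, and the label selector assumes the standard `app.kubernetes.io/name=karpenter` label set by the Karpenter Helm chart; the namespace matches the one used in this tutorial.

```shell
# Assumed alias name "kl": follow the Karpenter controller logs,
# starting from the last 20 lines of each container.
alias kl='kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter --all-containers=true -f --tail=20'
```

In an interactive terminal, typing `kl` then streams the logs until you press Ctrl+C.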
Karpenter logs should look similar to this (I’m including only the lines I want to highlight):
By reading the logs, you can see that Karpenter:
  • Noticed there were 10 pending pods, and decided that it can fit all pods on a single node.
  • Is accounting for the DaemonSet pods that run on every node (2 additional pods, kube-proxy among them), and is aggregating all resources needed for 12 pods. Moreover, Karpenter noticed that 100 instance types match these requirements.
  • Launched a c7g.2xlarge Spot Instance in eu-west-2a, as this was the pool with the most spare capacity at the lowest price.

Step 5: Spread Pods Within Multiple AZs

Karpenter launched only one node for all pending pods. However, putting all your eggs in the same basket is not recommended: if you lose that node, you’ll need to wait for Karpenter to provision a replacement (which can be fast, but you’ll still see an impact). To avoid this, and to make the workload more highly available, let’s spread the pods across multiple AZs by configuring a Topology Spread Constraint (TSP) within the Deployment.
Before you continue, remove the stateless Deployment:
💡 NOTE: To see pods being spread within AZs with similar instance sizes, wait until pods and existing EC2 instances launched by Karpenter are removed.
To configure a TSP, add the following snippet between the nodeSelector and the containers block from the workload.yaml file you downloaded before:
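A minimal sketch of such a snippet, assuming the pods carry the label `app: stateless` (a hypothetical label; use whatever your Deployment's pod template sets). The file name is arbitrary, and the six-space indentation assumes the snippet sits at the same level as `nodeSelector` and `containers` in the pod spec.

```shell
# Write a sketch of the topology spread constraint to a local file.
cat > tsc-snippet.yaml <<'EOF'
      topologySpreadConstraints:
        - maxSkew: 1                                 # zones may differ by at most one pod
          topologyKey: topology.kubernetes.io/zone   # spread across Availability Zones
          whenUnsatisfiable: DoNotSchedule           # keep pods Pending rather than break the spread
          labelSelector:
            matchLabels:
              app: stateless                         # assumed pod label
EOF
```

With `DoNotSchedule`, pods that would violate the spread stay pending, which is what prompts Karpenter to launch nodes in the other zones.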
💡 Tip: You can download the full version of the deployment manifest including the TSP here.
Create the stateless Deployment again. If you downloaded the manifest from GitHub, you can simply run:
Then, you can review the Karpenter logs and notice how different the actions are. Wait one minute and you should see the pods running within three nodes in different AZs:
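One way to check the spread is to list the nodes with their capacity type, zone, and instance type as extra columns. This is a sketch: the function name is hypothetical, and it assumes the nodes carry the well-known `topology.kubernetes.io/zone` and `node.kubernetes.io/instance-type` labels plus Karpenter's `karpenter.sh/capacity-type` label.

```shell
# Hypothetical helper: list nodes with capacity type, AZ, and instance type columns.
list_nodes_with_zones() {
  kubectl get nodes \
    -L karpenter.sh/capacity-type \
    -L topology.kubernetes.io/zone \
    -L node.kubernetes.io/instance-type
}
```

Running `list_nodes_with_zones` prints one row per node, so nodes launched in different AZs are easy to tell apart.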
You should see an output similar to this:

Step 6: (Optional) Simulate Spot Interruption

You can simulate a Spot interruption to test the resiliency of your applications. As I said before, Spot Instances are spare capacity offered at steep discounts in exchange for returning them when EC2 needs the capacity back. Spot interruptions come with a two-minute notice before EC2 reclaims the instance. Karpenter can watch for these interruptions (the cluster you created with Terraform is already configured this way). When this happens, Karpenter starts a new node as soon as it sees the Spot interruption warning. Karpenter’s average node startup time means that, generally, there is sufficient time for the new node to become ready and for the pods to move to it before the interrupted node is reclaimed.
You can simulate a Spot interruption using Fault Injection Simulator (FIS), either through the console or with the Amazon EC2 Spot Interrupter CLI.
In this tutorial, I’ll use a CloudFormation template to create a FIS experiment template, and then run an experiment to send a Spot interruption to one randomly chosen instance launched by Karpenter. You first need to download the CloudFormation template:
Now let's create the FIS experiment template by running the following command:
Now, you’ll need two extra terminals: 1) to monitor the nodes STATUS, and 2) for the Karpenter logs. In one terminal watch the nodes using this command:
In another terminal, run the following commands:
In the third terminal, run the following command to send a Spot interruption:
Review what happens by looking at the Karpenter logs. As soon as the Spot interruption warning lands, Karpenter immediately cordons and drains the node, and also launches a replacement instance:
You can also go back to the terminal where you listed all the nodes, and you'll see how the interrupted instance was cordoned, and when the new instance was launched.
Alternatively, to visualize the consolidation process, you can use eks-node-viewer. eks-node-viewer is a tool for visualizing dynamic node usage within a cluster. It was originally developed as an internal tool at AWS for demonstrating consolidation with Karpenter. It displays the scheduled pod resource requests vs the allocatable capacity on the node.
To launch it, execute the following in a new Cloud9 terminal tab:
💡 Tip: You might end up seeing only one or two Spot nodes running, and if you review the Karpenter logs, you’ll see that it was because of the consolidation process.

Step 7: (Optional) Deploy a Stateful Workload

You can still launch On-Demand Instances in a cluster that’s also running Spot Instances for those non-Spot-friendly workloads. Continue using the default Karpenter NodePool you created before, but make sure you configure the workload properly. One way of doing it is to use a nodeSelector, similar to the approach for the Spot-friendly workload.
Deploy the following application to simulate a stateful workload:
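A minimal sketch of such a workload, assuming a placeholder pause image. The Deployment name `stateful`, the file name, the replica count, and the resource requests are all hypothetical; the part that matters is the `karpenter.sh/capacity-type: on-demand` nodeSelector.

```shell
# Write a sketch of an On-Demand-only Deployment to a local file.
cat > stateful-sketch.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateful
spec:
  replicas: 3                  # illustrative replica count
  selector:
    matchLabels:
      app: stateful
  template:
    metadata:
      labels:
        app: stateful
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand   # keep these pods off Spot nodes
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7  # placeholder image
          resources:
            requests:
              cpu: 512m        # illustrative requests
              memory: 512Mi
EOF
```

Because the default NodePool allows both capacity types, the nodeSelector alone is enough to steer these pods to On-Demand capacity once the file is applied with `kubectl apply -f stateful-sketch.yaml`.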
Same as before, wait one minute, and you should see all pods running and one On-Demand node running:
You can review the Karpenter logs as well. You’ll see behavior similar to what you saw with the Spot-friendly workload.

Step 8: Clean Up

When you’re done with this tutorial, remove the two deployments you created:
Wait 30 seconds until the nodes that Karpenter launched are gone (due to consolidation), then remove all resources:

Conclusion

Using Spot Instances for your Kubernetes data plane nodes helps you reduce computing costs. As long as your workloads are fault-tolerant, stateless, and can use a variety of instance types, you can use Spot. Karpenter simplifies configuring your EKS cluster with a high degree of instance type diversification, and provisions only the capacity you need.

Workshops

You can learn more about using Karpenter on EKS with this hands-on workshop

Blueprints

Karpenter Blueprints is a repository that includes a list of common workload scenarios.

Documentation

Dive deeper into the Karpenter concepts here
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
