Picturesocial - How to add just-in-time compute capacity to a Kubernetes Cluster
Adding compute capacity to a Kubernetes cluster can be challenging because of the lengthy delay between a spike in compute demand and instance availability. In this post we are going to learn about Karpenter, an open source project that helps you get the capacity you need just when you need it.

- We have a node with 4 pods and around 35% of compute power free to scale current deployments.
- An increase in the demand made a deployment scale to one extra replica. We now have a node with 5 pods and around 17% of compute power free to scale current deployments.
- Another increase in the demand made a deployment scale to one more replica. We now have a node with 6 pods and around 5% of compute power free.
- The demand is still going up and the auto-scaling rules forced the Kubernetes Scheduler to add one more pod, but we don’t have compute power left to serve the demand, so the new pod stays in a pending state.
- Kubernetes realizes that it needs more nodes and, due to the worker autoscaling rules, it launches a new EC2 instance and deploys the pod onto the new node.
- If you are using Linux or macOS you can continue to the next bullet point. If you are using Microsoft Windows, I suggest you use WSL2.
- Install Git
- Install AWS CLI 2
- Install .NET 6
- Install eksctl
- Read the EKS Workshop - I used it for some parts of this blog post and if you want to go deeper, that’s the place to explore! 🚀
- If this is your first time working with AWS CLI or you need a refresher on how to set up your credentials, I suggest you follow this step-by-step guide to configuring your local environment. In the same link you can also follow the steps to configure Cloud9, which will be very helpful if you don’t want to install everything from scratch.
- We are going to start by opening our terminal and creating variables to set the Karpenter version that we are going to use, the region (us-east-1 in our case), the cluster name, and finally the identity of the current session.
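As a sketch, the variables could look like this. The version number and cluster name below are placeholders, and the account ID is resolved from the active AWS CLI session; adjust them to your own environment:

```shell
# Karpenter release to install (assumption: pick the version current for you)
export KARPENTER_VERSION="v0.6.3"

# Region and cluster used throughout this walkthrough
export AWS_DEFAULT_REGION="us-east-1"
export CLUSTER_NAME="picturesocial"   # assumption: replace with your cluster name

# Account ID of the identity for the current session
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
```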
- Now it’s time to add Karpenter as a cluster extension by running a CloudFormation script.
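A sketch of that step, following the pattern in the Karpenter getting-started guide (the exact template URL varies by version): download the template and deploy it as a stack, which creates the node role and controller policy that later steps reference.

```shell
# Download the versioned Karpenter CloudFormation template
TEMPOUT="$(mktemp)"
curl -fsSL "https://karpenter.sh/${KARPENTER_VERSION}/getting-started/cloudformation.yaml" > "${TEMPOUT}"

# Deploy it; CAPABILITY_NAMED_IAM is required because the stack creates IAM roles
aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"
```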
- This is the CloudFormation template, downloaded from karpenter.sh, that we are using.
- We are going to use the eksctl command line tool to create an IAM identity mapping for our cluster. This adds the Karpenter node role to our config map and allows the nodes to be managed by the cluster.
- Once this is done, we can check that the config map is ready by running the following command and verifying that an entry named Karpenter-{something} is in place.
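For example, inspecting the `aws-auth` config map in `kube-system` should show the newly mapped role:

```shell
# Look for the KarpenterNodeRole-... entry in the mapRoles section
kubectl describe configmap -n kube-system aws-auth
```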
- We are going to create an OpenID Connect (OIDC) provider for our cluster. This is needed to establish the trust relationship between Karpenter and our cluster.
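eksctl can associate the OIDC provider in one command:

```shell
# Create and associate an IAM OIDC provider for the cluster,
# which IAM roles for service accounts (IRSA) depend on
eksctl utils associate-iam-oidc-provider \
  --cluster "${CLUSTER_NAME}" \
  --approve
```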
- And finally we create the Kubernetes Service Account to give Karpenter permissions to launch new instances.
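A sketch of the service account creation, assuming the controller policy name `KarpenterControllerPolicy-${CLUSTER_NAME}` from the CloudFormation stack:

```shell
# Create a Kubernetes service account backed by an IAM role
# that carries the Karpenter controller policy
eksctl create iamserviceaccount \
  --cluster "${CLUSTER_NAME}" \
  --name karpenter \
  --namespace karpenter \
  --attach-policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --approve
```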
- We are going to use Helm to install the Karpenter dependencies into Kubernetes (config maps, pods, services, etc.), but first we add the repo from karpenter.sh.
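Adding the chart repository looks like this:

```shell
# Register the Karpenter Helm repository and refresh the local index
helm repo add karpenter https://charts.karpenter.sh/
helm repo update
```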
- And we install the Helm chart.
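A sketch of the install, based on the chart values used in the Karpenter getting-started guide for this generation of the chart (value names change between chart versions, so check the docs for yours):

```shell
# Install the Karpenter controller into its own namespace,
# reusing the service account created in the previous step
helm upgrade --install karpenter karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --version "${KARPENTER_VERSION}" \
  --set serviceAccount.create=false \
  --set serviceAccount.name=karpenter \
  --set clusterName="${CLUSTER_NAME}" \
  --set clusterEndpoint="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query 'cluster.endpoint' --output text)" \
  --set aws.defaultInstanceProfile="KarpenterNodeInstanceProfile-${CLUSTER_NAME}" \
  --wait
```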
- You can now check if everything is properly installed and running before continuing further.
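For example, the pods in the `karpenter` namespace should be in the `Running` state, and the controller log should be free of errors (the label selector below is the one the chart typically applies):

```shell
# Verify the Karpenter deployment is healthy
kubectl get pods -n karpenter

# Tail the controller logs to confirm it started cleanly
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=20
```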
- This YAML contains the provisioner for on-demand nodes. This is what it does: a/ Requirements: provisions new on-demand nodes that are extra-large or bigger; b/ Limits: the provisioner will not use more than 1,000 virtual cores and 1,000GB of RAM; c/ ttlSecondsAfterEmpty: how many seconds until an empty node is terminated.
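A sketch of such a provisioner, using the `karpenter.sh/v1alpha5` Provisioner API and assuming your subnets and security groups carry the `karpenter.sh/discovery` tag (both are assumptions; match them to your setup):

```shell
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # a/ Only on-demand capacity, instance sizes xlarge and up
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: karpenter.k8s.aws/instance-size
      operator: In
      values: ["xlarge", "2xlarge", "4xlarge"]
  # b/ Hard ceiling on total provisioned capacity
  limits:
    resources:
      cpu: "1000"
      memory: 1000Gi
  # c/ Terminate a node this many seconds after it becomes empty
  ttlSecondsAfterEmpty: 30
  provider:
    subnetSelector:
      karpenter.sh/discovery: ${CLUSTER_NAME}
    securityGroupSelector:
      karpenter.sh/discovery: ${CLUSTER_NAME}
EOF
```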
WARNING: This next step can generate considerable cost if you don’t delete the resources after finishing this walkthrough.
- It’s time to test if it really worked! I’m going to use the pause image. You have to change the number of replicas from 0 to at least 1 to see how this triggers a just-in-time scale-up of the node group.
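As a sketch, the test deployment below uses the pause image with a CPU request large enough that new replicas can't fit on the existing node (the deployment name `inflate` and the request size are illustrative choices):

```shell
# Deployment of pause containers with 0 replicas to start;
# each replica requests 1 vCPU so scaling up forces new capacity
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
EOF

# Scale up to trigger Karpenter to launch a new node
kubectl scale deployment inflate --replicas 5
```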
- Once you run this command, you are going to have at least 1 new node ready within the next 1-2 minutes. This is how just-in-time compute with Kubernetes and Karpenter allows us to scale easily to serve uncertain demand. Plus, it lets you put your effort into innovating on the application instead of spending so much time on infrastructure operations.
- Don’t forget to delete the deployment by running:
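Cleaning up the test deployment, and optionally the provisioner, could look like this:

```shell
# Remove the test workload; Karpenter terminates the now-empty node
# after ttlSecondsAfterEmpty elapses
kubectl delete deployment inflate

# Optional: remove the provisioner if you are done experimenting
kubectl delete provisioner default
```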
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.