
Network-Aware Dynamic Pod Scaling with Custom Metrics


Using native Kubernetes components with custom metrics for intelligent, resource-efficient scaling of network-intensive components.

Aritra Nag
Amazon Employee
Published Mar 24, 2025

Introduction

Many modern enterprise applications have specific resource requirements that can be challenging to accommodate within the constraints of cloud-based infrastructure and licensing models. This case study explores how a leading automotive manufacturer addressed the scaling of a network-intensive application.

Requirement:

The customer wants to scale pods based on network bandwidth usage. Out of the box, CPU and memory can be used by an HPA to create more replicas of the pods. A custom mechanism is therefore needed to export a custom metric, network bandwidth, from the pods, build a rule around that metric to monitor it, and scale out when a threshold value is exceeded.

Solution:

The application must be modified to export the custom metric, and the cascading constructs based on that metric must be implemented. The sections below give a step-by-step guide.
The following components should be installed on the cluster itself.
Pre-Installed Components in the Cluster:
Karpenter and EKS
  • Purpose: Automatically provisions new nodes in response to unschedulable pods
  • Installation: Typically installed via Helm (as discussed below) or Kubernetes manifests

You can install eks-node-viewer to visualize dynamic node usage within a cluster.
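eks-node-viewer is distributed by AWS Labs; it can be installed via Homebrew or Go, for example:

```shell
# Via Homebrew (AWS tap)
brew tap aws/tap
brew install eks-node-viewer

# Or build from source with Go
go install github.com/awslabs/eks-node-viewer/cmd/eks-node-viewer@latest

# Run against the current kubeconfig context to watch node utilization live
eks-node-viewer
```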
Karpenter uses a custom resource called Provisioner to define how it should provision nodes. Here's an example configuration that targets network-bandwidth-optimized capacity:
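Network bandwidth is not a first-class Kubernetes resource, so a common approach is to steer network-heavy pods onto network-optimized instance families via the Provisioner's requirements. A minimal sketch (instance types, labels, and limits are illustrative, not from the original article):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: network-intensive
spec:
  requirements:
    # Restrict provisioning to network-optimized instance types
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5n.large", "c5n.xlarge", "c5n.2xlarge"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  labels:
    workload-type: network-intensive   # lets deployments target these nodes
  limits:
    resources:
      cpu: "100"                       # caps total provisioned capacity
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30             # reclaim empty nodes quickly
```

In recent Karpenter releases the Provisioner API has been superseded by NodePool; the same requirements translate directly.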
We will set up a Deployment for our network-heavy application, then scale the application onto the nodes.
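A Deployment for the network-heavy application might look as follows (the image URL is a placeholder, and the nodeSelector assumes a `workload-type: network-intensive` label set by the Karpenter configuration):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: network-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: network-app
  template:
    metadata:
      labels:
        app: network-app
    spec:
      nodeSelector:
        workload-type: network-intensive   # schedule onto Karpenter-provisioned nodes
      containers:
        - name: network-app
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
          ports:
            - containerPort: 8080          # REST API and /metrics endpoint
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
```

When pods from this Deployment are unschedulable, Karpenter provisions matching nodes automatically.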
Prometheus
We will use Helm charts to install Prometheus.
Once the repo is updated, we will create the namespace
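Assuming the prometheus-community chart repository and a namespace named `prometheus` (the namespace name is an assumption), the commands would be:

```shell
# Add the community Helm repository and refresh the local chart index
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create a dedicated namespace for the monitoring stack
kubectl create namespace prometheus
```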
Here is the sample prometheus-value.yaml file that will be used to set up the Prometheus deployment.
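The original file is not reproduced here; a sketch for the kube-prometheus-stack chart (all settings illustrative) might be:

```yaml
# prometheus-value.yaml -- illustrative values for kube-prometheus-stack
grafana:
  enabled: true                 # bundled Grafana for dashboards
prometheus:
  prometheusSpec:
    scrapeInterval: 15s         # frequent scrapes keep HPA metrics fresh
    # Pick up ServiceMonitors/PodMonitors regardless of Helm release labels
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
alertmanager:
  enabled: false                # not needed for this scaling demo
```

The chart would then be installed with `helm install prometheus prometheus-community/kube-prometheus-stack -n prometheus -f prometheus-value.yaml`.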
We can verify the installation with the following commands:
The following commands can be used to access Prometheus and Grafana:
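With a release named `prometheus` installed via kube-prometheus-stack (service names vary with the chart and release name), verification and access could look like:

```shell
# Verify that the monitoring pods and services are up
kubectl get pods -n prometheus
kubectl get svc -n prometheus

# Prometheus UI on http://localhost:9090
kubectl port-forward -n prometheus svc/prometheus-kube-prometheus-prometheus 9090:9090

# Grafana UI on http://localhost:3000
kubectl port-forward -n prometheus svc/prometheus-grafana 3000:80
```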
Application Modification
Here is a sample application written in Python that exposes a REST API endpoint to download a file and also exports the metric.
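The original application is not reproduced here; in practice it would likely use Flask plus prometheus_client. The following dependency-free sketch shows the same idea: a /download endpoint that serves a file, and a /metrics endpoint exposing hypothetical `network_bytes_sent_total` and `http_active_requests` metrics in the Prometheus text format:

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-ins for prometheus_client Counter/Gauge objects.
bytes_sent_total = 0
active_requests = 0
lock = threading.Lock()

PAYLOAD = b"x" * (1024 * 1024)  # 1 MiB dummy "file" served by /download

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global bytes_sent_total, active_requests
        if self.path == "/download":
            with lock:
                active_requests += 1
                bytes_sent_total += len(PAYLOAD)  # count outbound bytes
            try:
                self.send_response(200)
                self.send_header("Content-Type", "application/octet-stream")
                self.send_header("Content-Length", str(len(PAYLOAD)))
                self.end_headers()
                self.wfile.write(PAYLOAD)
            finally:
                with lock:
                    active_requests -= 1
        elif self.path == "/metrics":
            # Prometheus text exposition format, scraped by the server
            with lock:
                body = (
                    "# TYPE network_bytes_sent_total counter\n"
                    f"network_bytes_sent_total {bytes_sent_total}\n"
                    "# TYPE http_active_requests gauge\n"
                    f"http_active_requests {active_requests}\n"
                ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

def serve(port=8080):
    """Run the app; the container's CMD would call this."""
    ThreadingHTTPServer(("", port), MetricsHandler).serve_forever()
```

Prometheus scrapes /metrics periodically, and the counter's rate of change approximates the pod's network bandwidth.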
Registry Configuration: Image to Be Used in the Deployment
The application above will be packaged as the container image referenced in the deployment.
To build a container and push the image to the registry, we will create a Dockerfile with the following setup and start building the image.
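A Dockerfile for such an application could look like this (the file names app.py and requirements.txt are assumptions):

```dockerfile
# Illustrative Dockerfile for the metrics-exporting application
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
EXPOSE 8080
CMD ["python", "app.py"]
```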
In requirements.txt, we will use the following libraries:
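The original list is not shown; a typical set for a Flask application exporting Prometheus metrics would be:

```text
flask
prometheus-client
```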
Once both files are created, we will run the following commands to build the image and push it to the registry.
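Assuming an ECR repository named `network-app` (account ID and region are placeholders), the build-and-push sequence would be:

```shell
# Authenticate Docker against the ECR registry
aws ecr get-login-password --region <region> | \
  docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

# Build, tag, and push the image
docker build -t network-app .
docker tag network-app:latest <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
```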
This will create an image and upload it to the registry; the image is then used when deploying the application to the Kubernetes cluster.
Prometheus Configuration
There are two parts of the Prometheus setup that need to be configured in this solution.
Prometheus Rules: Prometheus rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.
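With the Prometheus Operator (installed by kube-prometheus-stack), a recording rule is declared through a PrometheusRule resource. The sketch below precomputes a per-pod bandwidth rate from the hypothetical `network_bytes_sent_total` counter:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: network-app-rules
  namespace: prometheus
  labels:
    release: prometheus          # may be required for the operator to select the rule
spec:
  groups:
    - name: network-app.rules
      rules:
        # Per-pod bytes/second over the last minute, saved as a new series
        - record: pod:network_bandwidth_bytes_per_second
          expr: rate(network_bytes_sent_total[1m])
```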
Prometheus Adapter: The Prometheus Adapter is a bridge between Prometheus metrics and the Kubernetes custom metrics API. Its key functions are translating Prometheus queries into metrics the Kubernetes API understands and allowing HPAs to use Prometheus metrics for scaling decisions.
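The adapter is typically installed from the prometheus-community/prometheus-adapter chart with custom rules in its values file. A sketch, assuming the application exports `network_bytes_sent_total` and `http_active_requests` (names and the Prometheus URL are illustrative):

```yaml
# prometheus-adapter values -- maps app metrics into the custom metrics API
prometheus:
  url: http://prometheus-kube-prometheus-prometheus.prometheus.svc
  port: 9090
rules:
  custom:
    # Expose the bandwidth counter as a per-pod rate
    - seriesQuery: 'network_bytes_sent_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "network_bytes_sent_total"
        as: "network_bandwidth_bytes_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
    # Expose the in-flight request gauge as-is
    - seriesQuery: 'http_active_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "http_active_requests"
        as: "http_active_requests"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

Once deployed, `kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"` should list both metrics.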
Horizontal Pod Autoscaler (HPA)
This HPA configuration automates scaling of the 'network-app' deployment based on two custom metrics:
  1. Network bandwidth usage: Scales when average usage exceeds 5MB/s per pod.
  2. Active requests: Scales when average active requests exceed 3 per pod.
It maintains between 1 and 10 replicas, scaling up quickly (by 2 pods or 50%, whichever is greater, every 30 seconds) and scaling down conservatively (by 1 pod or 20%, whichever is less, every 60 seconds). It uses a short 30-second window for scale-up decisions and a longer 5-minute window for scale-down to prevent rapid fluctuations.
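The description above maps directly onto an autoscaling/v2 manifest (the metric names are illustrative and must match what the Prometheus Adapter exposes):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: network-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: network-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: network_bandwidth_bytes_per_second
        target:
          type: AverageValue
          averageValue: "5M"       # 5 MB/s per pod
    - type: Pods
      pods:
        metric:
          name: http_active_requests
        target:
          type: AverageValue
          averageValue: "3"        # 3 active requests per pod
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      selectPolicy: Max            # whichever policy adds more pods
      policies:
        - type: Pods
          value: 2
          periodSeconds: 30
        - type: Percent
          value: 50
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      selectPolicy: Min            # whichever policy removes fewer pods
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
        - type: Percent
          value: 20
          periodSeconds: 60
```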
Overview:
This solution enables Kubernetes to autoscale pods based on network bandwidth usage by:
  • Exporting custom metrics from the application.
  • Using Prometheus to collect and process these metrics.
  • Configuring Prometheus Adapter to make the metrics available to Kubernetes.
  • Scaling pods up onto network-intensive nodes, as defined in the Karpenter NodePool.
  • Setting up an HPA to use these custom metrics for scaling decisions.
By implementing this system, the customer can effectively scale their application based on network bandwidth usage, complementing the standard CPU and memory-based scaling capabilities of Kubernetes.
Do you want to get in touch?
Are you building tools or containerized systems? Are you a startup founder and want to discuss your startup with AWS startup experts and the authors of this article? Book your 1:1 meeting here!

Authors

Aritra Nag - Specialist Solutions Architect, AWS Migrations and Modernization.
Gokhan Kurt - Sr Global Solutions Architect, AWS Automotive.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
