
Network-Aware Dynamic Pod Scaling with Custom Metrics


Using native Kubernetes components with custom metrics for intelligent, resource-efficient scaling of network-intensive components.

Aritra Nag
Amazon Employee
Published Mar 24, 2025

Introduction

Many modern enterprise applications have specific resource requirements that can be challenging to accommodate within the constraints of cloud-based infrastructure and licensing models. This case study explores how a leading automotive manufacturer addressed the scaling of a network-intensive application.

Requirement:

The customer wants to scale pods based on network bandwidth usage. Out of the box, CPU and memory can be used by an HPA to create more replicas of the pods. A custom mechanism is therefore needed to export a custom metric, network bandwidth, from the pods, build a rule around that metric to monitor it, and scale out when a threshold value is exceeded.

Solution:

The application must be modified to export the custom metric, and the cascading constructs based on that metric must be implemented. The sections below give a step-by-step guide.
The following components should be installed on the cluster itself.
Pre-Installed Components in the Cluster:
Karpenter and EKS
  • Purpose: Automatically provisions new nodes in response to unschedulable pods
  • Installation: Typically installed via Helm (as discussed below) or Kubernetes manifests

You can install eks-node-viewer to visualize dynamic node usage within a cluster.
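eks-node-viewer is distributed by AWS Labs; it can be installed via Homebrew or Go, for example:

```shell
# Via Homebrew (AWS tap)
brew tap aws/tap
brew install eks-node-viewer

# Or build from source with Go
go install github.com/awslabs/eks-node-viewer/cmd/eks-node-viewer@latest

# Run against the current kubeconfig context to watch node utilization live
eks-node-viewer
```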
Karpenter uses a custom resource called Provisioner to define how it should provision nodes. Here's an example configuration that targets network-bandwidth-optimized capacity:
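Network bandwidth is not a first-class Kubernetes resource, so a common approach is to steer network-heavy pods onto network-optimized instance families via the Provisioner's requirements. A minimal sketch (instance types, labels, and limits are illustrative, not from the original article):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: network-intensive
spec:
  requirements:
    # Restrict provisioning to network-optimized instance types
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5n.large", "c5n.xlarge", "c5n.2xlarge"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  labels:
    workload-type: network-intensive   # lets deployments target these nodes
  limits:
    resources:
      cpu: "100"                       # caps total provisioned capacity
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30             # reclaim empty nodes quickly
```

In recent Karpenter releases the Provisioner API has been superseded by NodePool; the same requirements translate directly.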
We will set up a Deployment for our network-heavy application, then scale the application onto the nodes.
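A Deployment for the network-heavy application might look as follows (the image URL is a placeholder, and the nodeSelector assumes a `workload-type: network-intensive` label set by the Karpenter configuration):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: network-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: network-app
  template:
    metadata:
      labels:
        app: network-app
    spec:
      nodeSelector:
        workload-type: network-intensive   # schedule onto Karpenter-provisioned nodes
      containers:
        - name: network-app
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
          ports:
            - containerPort: 8080          # REST API and /metrics endpoint
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
```

When pods from this Deployment are unschedulable, Karpenter provisions matching nodes automatically.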
Prometheus
We will use Helm charts to install Prometheus.
Once the repo is updated, we will create the namespace
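Assuming the prometheus-community chart repository and a namespace named `prometheus` (the namespace name is an assumption), the commands would be:

```shell
# Add the community Helm repository and refresh the local chart index
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create a dedicated namespace for the monitoring stack
kubectl create namespace prometheus
```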
Here is the sample prometheus-value.yaml file that will be used to set up the Prometheus deployment.
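The original file is not reproduced here; a sketch for the kube-prometheus-stack chart (all settings illustrative) might be:

```yaml
# prometheus-value.yaml -- illustrative values for kube-prometheus-stack
grafana:
  enabled: true                 # bundled Grafana for dashboards
prometheus:
  prometheusSpec:
    scrapeInterval: 15s         # frequent scrapes keep HPA metrics fresh
    # Pick up ServiceMonitors/PodMonitors regardless of Helm release labels
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
alertmanager:
  enabled: false                # not needed for this scaling demo
```

The chart would then be installed with `helm install prometheus prometheus-community/kube-prometheus-stack -n prometheus -f prometheus-value.yaml`.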
We can verify the installation with the following commands:
The following commands can be used to access Prometheus and Grafana:
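With a release named `prometheus` installed via kube-prometheus-stack (service names vary with the chart and release name), verification and access could look like:

```shell
# Verify that the monitoring pods and services are up
kubectl get pods -n prometheus
kubectl get svc -n prometheus

# Prometheus UI on http://localhost:9090
kubectl port-forward -n prometheus svc/prometheus-kube-prometheus-prometheus 9090:9090

# Grafana UI on http://localhost:3000
kubectl port-forward -n prometheus svc/prometheus-grafana 3000:80
```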
Application Modification
Here is a sample application written in Python that exposes a REST API endpoint to download a file and also exports the metric.
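The original application is not reproduced here; in practice it would likely use Flask plus prometheus_client. The following dependency-free sketch shows the same idea: a /download endpoint that serves a file, and a /metrics endpoint exposing hypothetical `network_bytes_sent_total` and `http_active_requests` metrics in the Prometheus text format:

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-ins for prometheus_client Counter/Gauge objects.
bytes_sent_total = 0
active_requests = 0
lock = threading.Lock()

PAYLOAD = b"x" * (1024 * 1024)  # 1 MiB dummy "file" served by /download

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global bytes_sent_total, active_requests
        if self.path == "/download":
            with lock:
                active_requests += 1
                bytes_sent_total += len(PAYLOAD)  # count outbound bytes
            try:
                self.send_response(200)
                self.send_header("Content-Type", "application/octet-stream")
                self.send_header("Content-Length", str(len(PAYLOAD)))
                self.end_headers()
                self.wfile.write(PAYLOAD)
            finally:
                with lock:
                    active_requests -= 1
        elif self.path == "/metrics":
            # Prometheus text exposition format, scraped by the server
            with lock:
                body = (
                    "# TYPE network_bytes_sent_total counter\n"
                    f"network_bytes_sent_total {bytes_sent_total}\n"
                    "# TYPE http_active_requests gauge\n"
                    f"http_active_requests {active_requests}\n"
                ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

def serve(port=8080):
    """Run the app; the container's CMD would call this."""
    ThreadingHTTPServer(("", port), MetricsHandler).serve_forever()
```

Prometheus scrapes /metrics periodically, and the counter's rate of change approximates the pod's network bandwidth.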
Registry Configuration: Image to Be Used in the Deployment
The application above will be packaged as the container image referenced in the deployment.
To build a container and push the image to the registry, we will create a Dockerfile with the following setup and start building the image.
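A Dockerfile for such an application could look like this (the file names app.py and requirements.txt are assumptions):

```dockerfile
# Illustrative Dockerfile for the metrics-exporting application
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
EXPOSE 8080
CMD ["python", "app.py"]
```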
In requirements.txt, we will use the following libraries:
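The original list is not shown; a typical set for a Flask application exporting Prometheus metrics would be:

```text
flask
prometheus-client
```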
Once both files are created, we will run the following commands to build the image and push it to the registry.
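Assuming an ECR repository named `network-app` (account ID and region are placeholders), the build-and-push sequence would be:

```shell
# Authenticate Docker against the ECR registry
aws ecr get-login-password --region <region> | \
  docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

# Build, tag, and push the image
docker build -t network-app .
docker tag network-app:latest <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/network-app:latest
```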
This will create an image and upload it to the registry; the image is then used when deploying the application to the Kubernetes cluster.
Prometheus Configuration
There are two parts of the Prometheus setup that need to be configured in this solution.
Prometheus Rules: Prometheus rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.
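With the Prometheus Operator (installed by kube-prometheus-stack), a recording rule is declared through a PrometheusRule resource. The sketch below precomputes a per-pod bandwidth rate from the hypothetical `network_bytes_sent_total` counter:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: network-app-rules
  namespace: prometheus
  labels:
    release: prometheus          # may be required for the operator to select the rule
spec:
  groups:
    - name: network-app.rules
      rules:
        # Per-pod bytes/second over the last minute, saved as a new series
        - record: pod:network_bandwidth_bytes_per_second
          expr: rate(network_bytes_sent_total[1m])
```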
Prometheus Adapter: The Prometheus Adapter is a bridge between Prometheus metrics and the Kubernetes custom metrics API. Its key functions are translating Prometheus queries into metrics the Kubernetes API understands and allowing HPAs to use Prometheus metrics for scaling decisions.
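The adapter is typically installed from the prometheus-community/prometheus-adapter chart with custom rules in its values file. A sketch, assuming the application exports `network_bytes_sent_total` and `http_active_requests` (names and the Prometheus URL are illustrative):

```yaml
# prometheus-adapter values -- maps app metrics into the custom metrics API
prometheus:
  url: http://prometheus-kube-prometheus-prometheus.prometheus.svc
  port: 9090
rules:
  custom:
    # Expose the bandwidth counter as a per-pod rate
    - seriesQuery: 'network_bytes_sent_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "network_bytes_sent_total"
        as: "network_bandwidth_bytes_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
    # Expose the in-flight request gauge as-is
    - seriesQuery: 'http_active_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "http_active_requests"
        as: "http_active_requests"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

Once deployed, `kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"` should list both metrics.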
Horizontal Pod Autoscaler (HPA)
This HPA configuration automates scaling of the 'network-app' deployment based on two custom metrics:
  1. Network bandwidth usage: Scales when average usage exceeds 5MB/s per pod.
  2. Active requests: Scales when average active requests exceed 3 per pod.
It maintains between 1 and 10 replicas, scaling up quickly (by 2 pods or 50%, whichever is greater, every 30 seconds) and scaling down conservatively (by 1 pod or 20%, whichever is less, every 60 seconds). It uses a short 30-second window for scale-up decisions and a longer 5-minute window for scale-down to prevent rapid fluctuations.
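The description above maps directly onto an autoscaling/v2 manifest (the metric names are illustrative and must match what the Prometheus Adapter exposes):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: network-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: network-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: network_bandwidth_bytes_per_second
        target:
          type: AverageValue
          averageValue: "5M"       # 5 MB/s per pod
    - type: Pods
      pods:
        metric:
          name: http_active_requests
        target:
          type: AverageValue
          averageValue: "3"        # 3 active requests per pod
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      selectPolicy: Max            # whichever policy adds more pods
      policies:
        - type: Pods
          value: 2
          periodSeconds: 30
        - type: Percent
          value: 50
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      selectPolicy: Min            # whichever policy removes fewer pods
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
        - type: Percent
          value: 20
          periodSeconds: 60
```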
Overview:
This solution enables Kubernetes to autoscale pods based on network bandwidth usage by:
  • Exporting custom metrics from the application.
  • Using Prometheus to collect and process these metrics.
  • Configuring Prometheus Adapter to make the metrics available to Kubernetes.
  • Scaling pods up onto network-intensive nodes, as defined in the Karpenter NodePool.
  • Setting up an HPA to use these custom metrics for scaling decisions.
By implementing this system, the customer can effectively scale their application based on network bandwidth usage, complementing the standard CPU and memory-based scaling capabilities of Kubernetes.
Do you want to get in touch?
Are you building tools or containerized systems? Are you a startup founder and want to discuss your startup with AWS startup experts and the authors of this article? Book your 1:1 meeting here!

Authors

Aritra Nag - Specialist Solutions Architect, AWS Migrations and Modernization.
Gokhan Kurt - Sr Global Solutions Architect, AWS Automotive.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
