
Network-Aware Dynamic Pod Scaling with Custom Metrics

Using native Kubernetes components with custom metrics for intelligent, resource-efficient scaling of network-intensive components.

Aritra Nag
Published Mar 24, 2025

Introduction

Many modern enterprise applications have specific resource requirements that can be challenging to accommodate within the constraints of cloud-based infrastructure and licensing models. This case study explores how a leading automotive manufacturer addressed the scaling of a network-intensive application.

Requirement:

The customer wants to scale pods based on network bandwidth usage. Out of the box, the Horizontal Pod Autoscaler (HPA) can create additional pod replicas based on CPU and memory. What is needed is a custom mechanism that ships the custom metric, network bandwidth, out of the pods, plus a rule around that metric to monitor it and scale out when a threshold value is crossed.

Solution:

The application must be modified to export the custom metric, and the scaling constructs must be built on top of that metric. The sections below provide a step-by-step guide to implementing this.
The following components should be installed on the cluster itself.
Pre-installed components in the cluster:
Karpenter and EKS
  • Purpose: Automatically provisions new nodes in response to unschedulable pods
  • Installation: Typically installed via Helm (as discussed below) or Kubernetes manifests

You can install eks-node-viewer to visualize dynamic node usage within a cluster.
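A minimal install sketch using Homebrew (assuming Homebrew is available; the project's README also documents a Go-based install):

brew tap aws/tap
brew install eks-node-viewer
# Run against the current kubeconfig context
eks-node-viewer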
Karpenter uses a custom resource called NodePool (named Provisioner in older alpha APIs) to define how it should provision nodes. Here's an example configuration that targets instances with high network bandwidth:
cat <<EoF> basic-networking-nodepool.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: higher-bandwidth-usage
spec:
  disruption:
    budgets:
      - nodes: 10%
    consolidateAfter: 30s
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    metadata:
      labels:
        workload-type: network-intensive
        network-tier: high
    spec:
      expireAfter: 336h
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
        # Specifically targeting network-optimized instance families
        - key: eks.amazonaws.com/instance-network-bandwidth
          operator: Gt
          values: ["10000"]
        - key: eks.amazonaws.com/instance-generation
          operator: Gt
          values:
            - "4"
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
            - arm64
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
            - bottlerocket
      taints:
        - key: workload-type
          value: network-intensive
          effect: NoSchedule
      terminationGracePeriod: 24h0m0s
EoF
kubectl apply -f basic-networking-nodepool.yaml
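To confirm the NodePool was accepted and to see the nodes it later provisions, a quick check (assuming the Karpenter NodePool CRD is installed on the cluster):

kubectl get nodepools
# Nodes provisioned from this pool carry the template labels
kubectl get nodes -l workload-type=network-intensive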
Next, we will create a Deployment for our network-heavy application:
cat <<EoF> basic-networking-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: network-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: network-app
  template:
    metadata:
      labels:
        app: network-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: network-app
          image: your-registry/network-app:v6
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 8000
              name: metrics
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
EoF
kubectl apply -f basic-networking-deployment.yaml
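One gap to be aware of: the NodePool above taints its nodes with workload-type=network-intensive:NoSchedule, so pods from this Deployment need a matching toleration before they can land on those nodes. A sketch of adding one with kubectl patch (you could equally add the tolerations block to the manifest itself), followed by a scheduling check:

kubectl patch deployment network-app --type merge -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"workload-type","operator":"Equal","value":"network-intensive","effect":"NoSchedule"}]}}}}'
kubectl get pods -l app=network-app -o wide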
Then we will scale the application onto the nodes:
cat <<EoF> basic-networking-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: network-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: network-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: network_bandwidth_usage
          selector:
            matchLabels:
              namespace: default
        target:
          type: AverageValue
          averageValue: "8388608" # 8MB/s
EoF
kubectl apply -f basic-networking-hpa.yaml
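Until Prometheus and the Prometheus Adapter (set up in the following sections) are in place, the HPA will report the metric as <unknown>. You can watch its status with:

kubectl get hpa network-app-hpa
kubectl describe hpa network-app-hpa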
Prometheus
We will use Helm charts to install Prometheus:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Once the repo is updated, we will create the namespace:
kubectl create namespace prometheus
Here is a sample prometheus-values.yaml file that will be used to set up the Prometheus deployment:
cat <<EoF> prometheus-values.yaml
prometheus:
  prometheusSpec:
    retention: 15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 1000m
        memory: 1Gi

alertmanager:
  enabled: true
  config:
    global:
      resolve_timeout: 5m
    route:
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'null'
      routes:
        - match:
            alertname: Watchdog
          receiver: 'null'
    receivers:
      - name: 'null'
EoF

helm install prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --create-namespace --values prometheus-values.yaml
We can verify the installation with the following commands:
# Check pods
kubectl get pods -n prometheus
# Check services
kubectl get svc -n prometheus
The following commands can be used to access Prometheus and Grafana:
# Port forward Prometheus
kubectl port-forward -n prometheus svc/prometheus-kube-prometheus-prometheus 9090:9090
# Port forward Grafana
kubectl port-forward -n prometheus svc/prometheus-grafana 3000:80
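With the Prometheus port-forward active, you can check whether the application's counters are being scraped. One caveat: kube-prometheus-stack discovers targets through ServiceMonitor/PodMonitor resources by default, so the prometheus.io/* pod annotations used above may need a PodMonitor or an additional scrape config before this query returns data (jq is optional, for readability):

curl -s 'http://localhost:9090/api/v1/query?query=rate(network_bytes_total[5m])' | jq .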
Application Modification
Here is a sample application written in Python. It exposes a REST API endpoint for downloading a file and also ships out the metrics.
from flask import Flask, Response
from prometheus_client import generate_latest, Counter, Gauge, CONTENT_TYPE_LATEST, start_http_server

app = Flask(__name__)

# Define Prometheus metrics
BYTES_TRANSFERRED = Counter('network_bytes_total', 'Total bytes transferred')
ACTIVE_REQUESTS = Gauge('network_active_requests', 'Number of active requests')

@app.route('/metrics')
def metrics():
    return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST)

@app.route('/download')
def download_file():
    # Simulated file download logic
    ACTIVE_REQUESTS.inc()
    file_size = 100 * 1024 * 1024  # 100MB
    BYTES_TRANSFERRED.inc(file_size)
    # ... (download logic)
    ACTIVE_REQUESTS.dec()
    return "Download complete"

if __name__ == '__main__':
    start_http_server(8000)  # Prometheus metrics endpoint
    app.run(host='0.0.0.0', port=8080)
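A quick local smoke test of the app (assuming Python 3 with flask and prometheus_client installed):

python app.py &
curl -s localhost:8080/download
# The counter should now reflect the simulated 100MB transfer
curl -s localhost:8000/metrics | grep network_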
Registry Configuration: Image to be used in the Deployment
We will package the application above into the container image referenced in the Deployment:
your-registry/network-app:v6
To build the container and push the image to the registry, we will create a Dockerfile with the following content and start building the image:
# Use an official Python runtime as a parent image
FROM public.ecr.aws/docker/library/python:3.13-slim
# Set the working directory in the container
WORKDIR /usr/src/app
# Copy the current directory contents into the container at /usr/src/app
COPY . /usr/src/app
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Make port 8000 and 8080 available to the world outside this container
EXPOSE 8080 8000
# Run app.py when the container launches
CMD ["python", "app.py"]
In requirements.txt, we will use the following libraries:
flask == 3.1.*
prometheus_client == 0.21.*
Once both files are created, we will run the following command to build the image and push it to the registry:
docker buildx build -t "your-registry/network-app:v6" --platform linux/amd64,linux/arm64 --push .
This will create an image and upload it in the registry which will be used in the deployment of this application in the Kubernetes cluster.
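If the target registry is Amazon ECR (an assumption here; substitute your registry's own login flow), Docker must be authenticated before the push, for example:

aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com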
Prometheus Configuration
There are two parts of the Prometheus setup that need to be configured in this solution.
Prometheus rules: Prometheus rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.
# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: network-metrics
  namespace: monitoring
spec:
  groups:
    - name: network
      rules:
        - record: network_bandwidth_usage
          expr: |
            sum(
              rate(network_bytes_sent_total[5m]) +
              rate(network_bytes_received_total[5m])
            ) by (pod)
        - record: network_active_requests
          expr: network_active_requests
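Note that the recording rule assumes separate network_bytes_sent_total and network_bytes_received_total counters; the sample app above exposes a single network_bytes_total counter, so with that app you would record sum(rate(network_bytes_total[5m])) by (pod) instead. Apply the rule and confirm it is registered (kube-prometheus-stack selects PrometheusRule objects by label, so the rule may also need the labels your release expects):

kubectl apply -f prometheus-rules.yaml
kubectl get prometheusrules -A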
Prometheus Adapter: The Prometheus Adapter is a bridge between Prometheus metrics and the Kubernetes custom metrics API. Its key functions are translating Prometheus queries into metrics the Kubernetes API understands and allowing HPAs to use Prometheus metrics for scaling decisions.
# prometheus-adapter-values.yaml
prometheus:
  url: http://prometheus-server.prometheus.svc.cluster.local
  port: 80

rules:
  default: false
  custom:
    - seriesQuery: 'network_bandwidth_usage{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)$"
        as: "${1}"
      metricsQuery: <<.Series>>{<<.LabelMatchers>>}
    - seriesQuery: 'network_bytes_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: rate(<<.Series>>{<<.LabelMatchers>>}[1m])
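A sketch of installing the adapter with these values, using the prometheus-community repo added earlier. One caveat: with the kube-prometheus-stack install above, the Prometheus service is typically prometheus-kube-prometheus-prometheus on port 9090 rather than prometheus-server on port 80, so adjust url and port accordingly:

helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace prometheus --values prometheus-adapter-values.yaml
# Verify the custom metrics API now lists the exported metrics
kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1' | jq '.resources[].name'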
Horizontal Pod Autoscaler (HPA)
This HPA configuration automates scaling of the 'network-app' deployment based on two custom metrics:
  1. Network bandwidth usage: Scales when average usage exceeds 5MB/s per pod.
  2. Active requests: Scales when average active requests exceed 3 per pod.
It maintains between 1 and 10 replicas, scaling up quickly (by 2 pods or 50%, whichever is greater, every 30 seconds) and scaling down conservatively (by 1 pod or 20%, whichever is less, every 60 seconds). It uses a short 30-second window for scale-up decisions and a longer 5-minute window for scale-down to prevent rapid fluctuations.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: network-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: network-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: network_bandwidth_usage
          selector:
            matchLabels:
              namespace: default
        target:
          type: AverageValue
          averageValue: "5242880" # 5MB/s (adjust as needed)
    - type: Pods
      pods:
        metric:
          name: network_active_requests
          selector:
            matchLabels:
              namespace: default
        target:
          type: AverageValue
          averageValue: "3" # 3 active requests per pod
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30 # Reduced to be more responsive
      policies:
        - type: Pods
          value: 2 # Scale up by 2 pods at a time
          periodSeconds: 30
        - type: Percent
          value: 50 # Or by 50% of current replicas
          periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300 # 5 minutes to prevent rapid scale down
      policies:
        - type: Pods
          value: 1 # Scale down by 1 pod at a time
          periodSeconds: 60
        - type: Percent
          value: 20 # Or by 20% of current replicas
          periodSeconds: 60
      selectPolicy: Min # Use the more conservative scaling option
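To exercise the full loop, apply the HPA, drive some traffic at the /download endpoint, and watch the replica count. A simple sketch using a port-forward (a real test would go through a Service or a proper load generator):

kubectl apply -f network-app-hpa.yaml
kubectl port-forward deploy/network-app 8080:8080 &
for i in $(seq 1 50); do curl -s localhost:8080/download > /dev/null; done
kubectl get hpa network-app-hpa --watch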
Overview:
This solution enables Kubernetes to autoscale pods based on network bandwidth usage by:
  • Exporting custom metrics from the application.
  • Using Prometheus to collect and process these metrics.
  • Configuring Prometheus Adapter to make the metrics available to Kubernetes.
  • Scaling pods up onto network-intensive nodes as defined in the Karpenter NodePool.
  • Setting up an HPA to use these custom metrics for scaling decisions.
By implementing this system, the customer can effectively scale their application based on network bandwidth usage, complementing the standard CPU and memory-based scaling capabilities of Kubernetes.
Do you want to get in touch?
Are you building tools or containerized systems? Are you a startup founder and want to discuss your startup with AWS startup experts and the authors of this article? Book your 1:1 meeting here!

Authors

Aritra Nag - Specialist Solutions Architect, AWS Migrations and Modernization.
Gokhan Kurt - Sr Global Solutions Architect, AWS Automotive.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
