Automate Monitoring of Available IPs in AWS VPC Subnets
Use Amazon Q Developer and AWS Infrastructure Composer to automate the monitoring of available IP addresses in subnets.
Published Nov 21, 2024
I want to begin by saying that Amazon Q Developer and AWS Infrastructure Composer helped me design this solution in a matter of minutes.
Let's discuss the problem I'm attempting to tackle. IP exhaustion, which occurs when given subnets run out of IPs, is a problem that may arise if you are using Amazon EKS and your workload is growing.
At the time of writing, Amazon CloudWatch does not provide metrics for available IP addresses in subnets unless you use IPAM. What I'm attempting to accomplish here is monitoring the available IP addresses in subnets without IPAM.
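To make the metric concrete, here is a minimal sketch of how utilization can be derived from a subnet's CIDR block and its available IP count. It relies on the fact that AWS reserves 5 addresses in every subnet. The function names are my own for illustration and are not taken from the repository.

```python
import ipaddress

# AWS reserves 5 addresses in every subnet
# (network address, VPC router, DNS, future use, broadcast).
AWS_RESERVED_IPS = 5


def usable_ips(cidr: str) -> int:
    """Number of addresses in the subnet that workloads can actually use."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_IPS, 0)


def utilization_percent(cidr: str, available: int) -> float:
    """Percentage of usable addresses that are currently taken."""
    total = usable_ips(cidr)
    if total == 0:
        return 100.0
    used = total - available
    return round(100.0 * used / total, 2)


if __name__ == "__main__":
    # A /24 subnet with 51 free addresses left.
    print(utilization_percent("10.0.1.0/24", 51))
```

A /24 subnet has 256 addresses, so 251 are usable. With 51 still free, 200 of 251 are in use, which is roughly 80% utilization.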
- AWS Lambda
- Amazon EventBridge Scheduler
- Amazon CloudWatch Metrics
- Amazon CloudWatch Alarms
- Amazon SNS
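The way these services fit together can be sketched roughly as follows: a scheduled Lambda function reads the subnets, computes utilization, and publishes a custom CloudWatch metric that an alarm can watch. This is not the code from the repository. The environment variable names (`SUBNET_IDS`, `METRIC_NAMESPACE`), the default namespace, and the metric name are assumptions made for illustration.

```python
import ipaddress
import os

AWS_RESERVED_IPS = 5  # addresses AWS reserves in every subnet


def build_metric_data(subnets, metric_name="SubnetIPUtilization"):
    """Turn the 'Subnets' list from describe_subnets into CloudWatch MetricData."""
    metric_data = []
    for subnet in subnets:
        total = ipaddress.ip_network(subnet["CidrBlock"]).num_addresses - AWS_RESERVED_IPS
        if total <= 0:
            continue  # nothing usable to report on
        used = total - subnet["AvailableIpAddressCount"]
        metric_data.append({
            "MetricName": metric_name,  # hypothetical metric name
            "Dimensions": [{"Name": "SubnetId", "Value": subnet["SubnetId"]}],
            "Value": round(100.0 * used / total, 2),
            "Unit": "Percent",
        })
    return metric_data


def lambda_handler(event, context):
    # Imported here so build_metric_data can be exercised without AWS access.
    import boto3

    # Assumed configuration: comma-separated subnet IDs and a metric namespace.
    subnet_ids = os.environ["SUBNET_IDS"].split(",")
    namespace = os.environ.get("METRIC_NAMESPACE", "Custom/SubnetMonitoring")

    ec2 = boto3.client("ec2")
    cloudwatch = boto3.client("cloudwatch")

    response = ec2.describe_subnets(SubnetIds=subnet_ids)
    metric_data = build_metric_data(response["Subnets"])
    if metric_data:
        cloudwatch.put_metric_data(Namespace=namespace, MetricData=metric_data)
    return {"published": len(metric_data)}
```

Keeping the calculation in a plain function separate from the boto3 calls makes it easy to test locally with a hand-written `Subnets` list.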
I was able to create this in a matter of minutes with the help of Amazon Q Developer, although I did need to make a few small adjustments. It is very beneficial if you understand the basics and know what you are doing. Rather than configuring AWS services blindly, I recommend that everyone take the time to understand them first.
AWS Infrastructure Composer further enables you to design your infrastructure visually, generate Infrastructure as Code, and deploy it using AWS SAM (AWS Serverless Application Model) https://aws.amazon.com/serverless/sam/.
- AWS CLI installed and configured with appropriate permissions
- AWS Toolkit for Visual Studio Code installed and configured
- AWS SAM CLI installed
Repository for entire code and instructions on how to deploy: https://github.com/awsfanboy/aws-subnet-ip-address-utilization-monitor
- Modify the template.yaml file to adjust default parameter values or add/remove resources as needed, e.g. VPC ID, Subnet Name, Subnet ID, CloudWatch Metric Namespace.
- (Optional) Update the lambda_function.py file in the src directory.
- Build the SAM application:
sam build
- Deploy the SAM application:
sam deploy --guided
- This will start an interactive deployment process. You'll be prompted to provide values for the parameters defined in the template. You can accept the default values or provide your own.
- During the deployment, you'll be asked to confirm the creation of IAM roles and the changes to be applied. Review and confirm these.
- SAM will output the ARNs of the created Lambda function and SNS topic once the deployment is complete.
- Lambda function for monitoring subnets
- EventBridge rule to trigger the Lambda function every minute
- SNS topic for sending alerts
- CloudWatch alarms for each monitored subnet
- To monitor more than two subnets, duplicate the SubnetUtilizationAlarm resource in the template and adjust the SubnetIds parameter.
- Modify the Lambda function code in src/lambda_function.py to implement your specific monitoring logic.
- Adjust the alarm thresholds and evaluation periods in the SubnetUtilizationAlarm resources as needed.
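The repository defines its alarms in template.yaml, so the sketch below is only an illustration of which alarm properties the customization steps above refer to, expressed as arguments for boto3's `put_metric_alarm`. The alarm name pattern, namespace, metric name, and default threshold are assumptions, not values from the template.

```python
def alarm_kwargs(subnet_id, topic_arn, threshold=80.0, evaluation_periods=1,
                 period=60, namespace="Custom/SubnetMonitoring",
                 metric_name="SubnetIPUtilization"):
    """Build put_metric_alarm arguments for one subnet (all names hypothetical)."""
    return {
        "AlarmName": f"subnet-ip-utilization-{subnet_id}",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": "SubnetId", "Value": subnet_id}],
        "Statistic": "Maximum",
        "Period": period,                          # seconds per datapoint
        "EvaluationPeriods": evaluation_periods,   # datapoints that must breach
        "Threshold": threshold,                    # utilization percentage
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],               # SNS topic to notify
    }


def create_alarms(subnet_ids, topic_arn, **overrides):
    """Create one alarm per subnet instead of duplicating a resource by hand."""
    import boto3  # imported here so alarm_kwargs can be tested without AWS access

    cloudwatch = boto3.client("cloudwatch")
    for subnet_id in subnet_ids:
        cloudwatch.put_metric_alarm(**alarm_kwargs(subnet_id, topic_arn, **overrides))
```

For example, `create_alarms(["subnet-aaa", "subnet-bbb", "subnet-ccc"], topic_arn, threshold=90.0, evaluation_periods=3)` would cover three subnets with a stricter threshold that must be breached for three consecutive periods.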
- To remove all resources created by this stack:
sam delete
- Follow the prompts to confirm the deletion of resources.
I have an Amazon EKS cluster running a deployment with 6 replicas. The worker nodes run in 2 subnets, and IP address utilization is looking good.
The alarm state is OK.
Okay! Let's increase the number of replicas from 6 to 600.
Let's check the metrics in CloudWatch, and oops! Now we can see that IP utilization is high.
Now, let's check the alarms in CloudWatch. The state has changed from OK to ALARM.
Let's check my emails.
I can see there are 2 emails in my inbox.
I estimated the cost using calculator.aws, and it turned out to be quite reasonable.
These notifications can be sent to Slack, PagerDuty, and other platforms.
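As one possible way to do this for Slack, a second Lambda function could subscribe to the SNS topic and forward each alarm to a Slack incoming webhook. This is a rough sketch and is not part of the repository. The `SLACK_WEBHOOK_URL` environment variable is an assumption, while `AlarmName`, `NewStateValue`, and `NewStateReason` are fields that CloudWatch includes in the alarm message it sends to SNS.

```python
import json
import os
import urllib.request


def slack_payload(sns_event):
    """Convert an SNS-delivered CloudWatch alarm into a Slack webhook payload."""
    record = sns_event["Records"][0]["Sns"]
    alarm = json.loads(record["Message"])  # CloudWatch sends the alarm as JSON
    text = (
        f"{alarm['NewStateValue']}: {alarm['AlarmName']} - "
        f"{alarm['NewStateReason']}"
    )
    return {"text": text}


def lambda_handler(event, context):
    # Assumed configuration: the webhook URL is provided as an environment variable.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    body = json.dumps(slack_payload(event)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return {"status": response.status}
```

Using only the standard library keeps the function free of extra dependencies, so it can be deployed without a packaging step.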
I hope my automation will help someone who doesn't want to use IPAM to monitor IP address utilization in subnets, and I truly wish we could access these metrics straight from CloudWatch.
If you have any suggestions for improvement or if you would like to use anything you currently have in a different way, please feel free to share.