Identifying Proxy Metrics for Sustainability Optimization
Learn how to extract, baseline and track proxy metrics using a sample architecture for sustainability optimization
- AWS Trusted Advisor (TA) provides actionable proxy metrics (and optimization recommendations) by running checks that analyze the usage and configuration of resources in your AWS account.
- AWS Cost and Usage Reports (CUR) contain line items for each unique combination of AWS products, usage type, and operation that you use in your AWS account. You can use Amazon Athena to aggregate the usage data available in the CUR to identify proxy metrics for optimization (see the query sketch after this list).
- Amazon CloudWatch collects and tracks metrics, which are variables you can measure for your resources and applications.
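As a minimal sketch of the CUR-plus-Athena approach, the query below aggregates the last 30 days of usage by product, usage type, and operation. It assumes the standard CUR/Athena integration is already set up; the database and table names (`cur_db.cur_table`), region, and results bucket are placeholders you would replace with your own.

```python
# Sketch: aggregate CUR usage with Athena via boto3 (names are placeholders).
import boto3

athena = boto3.client("athena", region_name="us-east-1")

QUERY = """
SELECT line_item_product_code,
       line_item_usage_type,
       line_item_operation,
       SUM(line_item_usage_amount) AS total_usage
FROM cur_db.cur_table
WHERE line_item_usage_start_date >= current_date - interval '30' day
GROUP BY 1, 2, 3
ORDER BY total_usage DESC
"""

response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "cur_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/cur/"},
)
print("Started query:", response["QueryExecutionId"])
```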
The sample architecture for AnyCompany's workload:
- operational data is received from third-party providers once a day
- in-house developed legacy code running on a compute cluster processes the ingested data every night
- processed data results are stored in a database and emailed to corporate analysts/specialists the next morning
- results are also stored in object storage, from where data scientists in the corporate office download them and build complex data models using high-performance desktops
- historical data (stored in the database) is accessed when needed
The CPUUtilization metric for an EC2 instance can be used to find the percentage of physical CPU time that Amazon EC2 uses to run the instance, which includes time spent running both the user code and the Amazon EC2 code. You can aggregate CPUUtilization across all four EC2 instances over a period of time either by using CloudWatch Metrics Insights or by ingesting CloudWatch metrics into Amazon S3 with a metric stream for further analysis. For a quick CPU utilization summary and visualization, you can use the CloudWatch Metrics Insights console and query the metrics data with its SQL query engine; currently, you can query only the most recent three hours of metrics data.
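As an illustration of the Metrics Insights option, here is a minimal boto3 sketch that runs a Metrics Insights query through GetMetricData to average CPUUtilization per instance over the last three hours. The region and query period are assumptions for the example.

```python
# Sketch: CloudWatch Metrics Insights query via GetMetricData (last 3 hours only).
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

end = datetime.now(timezone.utc)
start = end - timedelta(hours=3)

response = cloudwatch.get_metric_data(
    MetricDataQueries=[
        {
            "Id": "avg_cpu",
            # Average CPUUtilization per instance across the AWS/EC2 namespace
            "Expression": 'SELECT AVG(CPUUtilization) FROM SCHEMA("AWS/EC2", InstanceId) GROUP BY InstanceId',
            "Period": 300,
        }
    ],
    StartTime=start,
    EndTime=end,
)

for result in response["MetricDataResults"]:
    print(result["Label"], result["Values"])
```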
For the preceding sample architecture optimization, we would like to calculate utilization over a 30-day period. CloudWatch metric streams can continuously stream metrics to supported destinations, including Amazon S3 and third-party service provider destinations. Let's review the steps involved in calculating the CPUUtilization of the four EC2 instances using metric streams:
- In the CloudWatch console, create a new stream with the AWS/EC2 namespace and select the CPUUtilization metric
- Select Quick S3 setup and let CloudWatch create the required resources (Kinesis Data Firehose stream, S3 bucket, IAM role, etc.) to emit the metrics in JSON format
- Once you finish creating the metric stream, CloudWatch automatically starts directing the EC2 instance metrics to the Amazon Kinesis Data Firehose delivery stream, which delivers them to a data lake in S3
- You can then use Amazon Athena to query the metric data stored in S3. Refer to this user guide for creating a table with AWS Glue for the metrics data stored in S3 and accessing it in Athena for querying
- Once you have set up Athena to query the S3 data, you can run SQL queries to filter and aggregate the metric data for the four EC2 instances and calculate the average CPUUtilization by dividing the sum statistic by the sample count statistic (see the sketch after this list)
- A deep dive into the JSON data format, extraction, and the SQL queries used to aggregate and process the data to calculate average utilization is beyond the scope of this article
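As a rough sketch of the Athena aggregation described above, the query below computes a 30-day average CPUUtilization per instance by dividing the summed sum statistic by the summed sample count. The database and table names and the exact column layout depend on how AWS Glue catalogs your metric-stream JSON, so treat them as placeholders.

```python
# Sketch: 30-day average CPUUtilization from metric-stream JSON data in S3.
# Assumed (not from the article): a Glue-crawled table metric_stream_db.ec2_metrics
# with columns namespace, metric_name, a dimensions struct, an epoch-millisecond
# "timestamp", and a value struct holding sum/count. Adjust to your crawler output.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

QUERY = """
SELECT dimensions.instanceid AS instance_id,
       SUM(value.sum) / SUM(value.count) AS avg_cpu_utilization
FROM metric_stream_db.ec2_metrics
WHERE namespace = 'AWS/EC2'
  AND metric_name = 'CPUUtilization'
  AND from_unixtime("timestamp" / 1000) >= current_date - interval '30' day
GROUP BY dimensions.instanceid
"""

athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "metric_stream_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/metrics/"},
)
```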
For the preceding sample architecture, proxy metrics can be normalized by business metrics to track optimization over time, for example (a small calculation sketch follows this list):
- average vCPU utilization of all instances (proxy metric) / number of processed transactions (business metric)
- total vCPU-hours used by all instances (proxy metric) / number of processed transactions (business metric)
- total S3 data stored in GB/TB (proxy metric) / number of analyses performed (business metric)
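To show how such a normalized KPI could be tracked between measurement periods, here is a small illustrative calculation. The numbers are purely hypothetical and are not measurements from the sample architecture.

```python
# Sketch: normalize a proxy metric by a business metric and compare two periods.
# All values below are hypothetical illustrations.

def sustainability_kpi(proxy_metric_value: float, business_metric_value: float) -> float:
    """Provisioned-resource (proxy) units consumed per unit of business outcome."""
    return proxy_metric_value / business_metric_value

# Example: total vCPU-hours used by the cluster vs. transactions processed
baseline = sustainability_kpi(proxy_metric_value=1_440.0, business_metric_value=200_000)
after_optimization = sustainability_kpi(proxy_metric_value=960.0, business_metric_value=210_000)

improvement_pct = (baseline - after_optimization) / baseline * 100
print(f"Baseline: {baseline:.6f} vCPU-hours per transaction")
print(f"After optimization: {after_optimization:.6f} vCPU-hours per transaction")
print(f"Improvement: {improvement_pct:.1f}%")
```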
The BucketSizeBytes metric can also be used to determine the amount of data stored across the storage tiers of an S3 bucket, and you can then define S3 Lifecycle rules to transition objects to other storage classes. You can also use S3 Storage Lens, which delivers organization-wide visibility into object storage usage and activity trends, and makes actionable recommendations. Implementing the above-mentioned changes aligns with the AWS Well-Architected Sustainability Pillar best practices to remove redundant data and to use policies to manage the lifecycle of your data.
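As a sketch of both ideas, the snippet below reads BucketSizeBytes per storage class from CloudWatch and applies a simple S3 Lifecycle transition rule. The bucket name, prefix, storage-class list, and 90-day threshold are assumptions for illustration only.

```python
# Sketch: inspect S3 storage by class, then add a lifecycle transition rule.
import boto3
from datetime import datetime, timedelta, timezone

BUCKET = "anycompany-results-bucket"  # hypothetical bucket name

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
s3 = boto3.client("s3", region_name="us-east-1")

# BucketSizeBytes is a daily storage metric, reported per StorageType dimension.
for storage_type in ["StandardStorage", "StandardIAStorage", "GlacierStorage"]:
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/S3",
        MetricName="BucketSizeBytes",
        Dimensions=[
            {"Name": "BucketName", "Value": BUCKET},
            {"Name": "StorageType", "Value": storage_type},
        ],
        StartTime=datetime.now(timezone.utc) - timedelta(days=2),
        EndTime=datetime.now(timezone.utc),
        Period=86400,
        Statistics=["Average"],
    )
    for point in stats["Datapoints"]:
        print(storage_type, round(point["Average"] / 1024**3, 2), "GiB")

# Transition older result objects to a colder storage class (example rule).
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-old-results",
                "Status": "Enabled",
                "Filter": {"Prefix": "results/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```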
The Amazon RDS Idle DB Instances Trusted Advisor check can be used to find out whether the RDS MySQL database (used for storing summary data in the preceding architecture) is actively being used by the specialists/analysts. Using the Amazon RDS Idle DB Instances check, AnyCompany can identify whether the RDS instance has not been accessed (has been inactive) for 7 or more days (the check reports the days since the last connection, up to 14 or more days), and plan for optimization. This proxy metric is already normalized, in that it highlights the absence of a business outcome (data processed) over the reported number of days, so we don't have to identify any other business metric for measuring optimization over a period of time. After implementing the optimization changes, we can review the same proxy metric to compare the improvement.
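A minimal sketch for pulling this check programmatically with the AWS Support API (available with Business, Enterprise On-Ramp, or Enterprise Support plans) is shown below; the check is looked up by name rather than by a hard-coded id.

```python
# Sketch: read the "Amazon RDS Idle DB Instances" Trusted Advisor check with boto3.
# The AWS Support API endpoint is in us-east-1.
import boto3

support = boto3.client("support", region_name="us-east-1")

# Look up the check id by name rather than hard-coding it.
checks = support.describe_trusted_advisor_checks(language="en")["checks"]
idle_rds_check = next(c for c in checks if c["name"] == "Amazon RDS Idle DB Instances")

result = support.describe_trusted_advisor_check_result(
    checkId=idle_rds_check["id"], language="en"
)["result"]

print("Check status:", result["status"])
for resource in result["flaggedResources"]:
    # Each flagged resource's metadata columns follow the order listed in
    # idle_rds_check["metadata"] (which includes days since last connection).
    print(dict(zip(idle_rds_check["metadata"], resource["metadata"])))
```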
Similarly, AnyCompany can optimize the compute cluster based on the CPUUtilization metric. Refer to the AWS Well-Architected Sustainability Pillar best practices [SUS05-BP01 Use the minimum amount of hardware to meet your needs] and [SUS02-BP01 Scale workload infrastructure dynamically].
- Query Amazon S3 data using AWS Athena (for processing the AWS CloudWatch metric stream use case)
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.