How to use CloudTrail to analyse your CloudWatch API Usage
New CloudTrail data event logging for the CloudWatch API helps you identify your biggest spenders
Steph Gooch
Amazon Employee
Published Jul 9, 2024
Last Modified Jul 25, 2024
Are your Amazon CloudWatch costs spiralling out of control? Do you feel like you're flying blind when it comes to API usage? The latest episode of Keys to AWS Optimization has just dropped new tips for AWS users everywhere. Prepare to start optimizing your CloudWatch spend and improving your observability efficiency.
The Keys to AWS Optimization show recently featured an episode on how the Amazon CloudWatch GetMetricData API now supports AWS CloudTrail data event logging.
Amazon CloudWatch offers several billable data plane APIs that are charged on a per-usage basis. You can stay on top of those charges and identify cost optimization opportunities by analyzing the AWS CloudTrail logs of that API usage. Learn how to log these data plane events in CloudTrail and gain visibility into the usage patterns of your CloudWatch GetMetricData API calls with Amazon CloudWatch Logs Insights and/or Amazon Athena queries.
Here’s what was covered, in case you missed it:
AWS has announced a new feature allowing customers to analyze CloudWatch API usage through CloudTrail.
"Prior to the launch of API in CloudTrail, customers would have to contact support to find the top contributors, who the caller is, which user, which IP address the calls are coming from. Now customers can find it themselves." - Chaitanya Gummadi
To use this feature, you need to set up CloudTrail with data events enabled.
"We provisioned and created the trail in CloudTrail who is going to record the CloudWatch activities and we were sending the results of this activity of CloudTrail back into CloudWatch log." - Benjamin Lecoq
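As a rough sketch of the setup step described above, the snippet below builds a CloudTrail advanced event selector for CloudWatch data events and applies it to an existing trail with boto3. The helper names and the trail name `cloudwatch-api-usage` are hypothetical, and the resource type `AWS::CloudWatch::Metric` is an assumption that should be checked against the CloudTrail documentation for your account and Region.

```python
def build_cloudwatch_data_event_selector(name="Log CloudWatch metric data events"):
    """Return one advanced event selector for CloudWatch data plane calls.

    Assumption: CloudWatch data events are selected with the resource
    type "AWS::CloudWatch::Metric".
    """
    return {
        "Name": name,
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::CloudWatch::Metric"]},
        ],
    }


def enable_cloudwatch_data_events(trail_name):
    """Apply the selector to an existing trail (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    cloudtrail = boto3.client("cloudtrail")
    return cloudtrail.put_event_selectors(
        TrailName=trail_name,
        AdvancedEventSelectors=[build_cloudwatch_data_event_selector()],
    )


if __name__ == "__main__":
    # Hypothetical trail name; replace with your own before running.
    print(enable_cloudwatch_data_events("cloudwatch-api-usage"))
```

Note that `put_event_selectors` sets the selectors for the whole trail, so any selectors the trail already has would need to be included in the same call. Data events are billed separately in CloudTrail, so it may be worth enabling this only for the period you want to analyze.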
You can analyze the data using either CloudWatch Logs Insights or Athena. Each has its benefits:
"If you're comfortable with CloudWatch Log Insight queries, use the power of AppInsights. If not, you could go straight to Athena," - Chaitanya Gummadi
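For the CloudWatch Logs Insights route, a minimal sketch of a "top callers" query might look like the following. It assumes the trail delivers events to a CloudWatch Logs log group; the function names and the log group name `cloudtrail-cloudwatch-usage` are hypothetical, and the query itself is an illustration rather than the one shown in the episode.

```python
import time


def build_top_callers_query(event_name="GetMetricData", limit=20):
    """Build a Logs Insights query that ranks callers of one CloudWatch API."""
    return "\n".join([
        'filter eventSource = "monitoring.amazonaws.com"'
        f' and eventName = "{event_name}"',
        "| stats count(*) as calls by userIdentity.arn, sourceIPAddress, userAgent",
        "| sort calls desc",
        f"| limit {limit}",
    ])


def start_top_callers_query(log_group_name, hours=24):
    """Start the query over the last `hours` hours and return its query ID."""
    import boto3  # imported lazily; needs AWS credentials at call time

    logs = boto3.client("logs")
    now = int(time.time())
    response = logs.start_query(
        logGroupName=log_group_name,
        startTime=now - hours * 3600,
        endTime=now,
        queryString=build_top_callers_query(),
    )
    return response["queryId"]


if __name__ == "__main__":
    # Hypothetical log group name; replace with the one your trail writes to.
    print(start_top_callers_query("cloudtrail-cloudwatch-usage"))
```

The returned query ID can then be passed to `get_query_results` once the query has finished. Grouping by ARN, source IP address, and user agent mirrors the caller details mentioned in the quote above.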
Athena Table creation:
CREATE EXTERNAL TABLE cloudtrail_logs_all_accounts(
eventVersion STRING,
userIdentity STRUCT<
type: STRING,
principalId: STRING,
arn: STRING,
accountId: STRING,
invokedBy: STRING,
accessKeyId: STRING,
userName: STRING,
sessionContext: STRUCT<
attributes: STRUCT<
mfaAuthenticated: STRING,
creationDate: STRING>,
sessionIssuer: STRUCT<
type: STRING,
principalId: STRING,
arn: STRING,
accountId: STRING,
userName: STRING>,
ec2RoleDelivery:string,
webIdFederationData: STRUCT<
federatedProvider: STRING,
attributes: map<string,string>>
>
>,
eventTime STRING,
eventSource STRING,
eventName STRING,
awsRegion STRING,
sourceIpAddress STRING,
userAgent STRING,
errorCode STRING,
errorMessage STRING,
requestparameters STRING,
responseelements STRING,
additionaleventdata STRING,
requestId STRING,
eventId STRING,
readOnly STRING,
resources ARRAY<STRUCT<
arn: STRING,
accountId: STRING,
type: STRING>>,
eventType STRING,
apiVersion STRING,
recipientAccountId STRING,
serviceEventDetails STRING,
sharedEventID STRING,
vpcendpointid STRING,
tlsDetails struct<
tlsVersion:string,
cipherSuite:string,
clientProvidedHostHeader:string>
)
PARTITIONED BY (
`timestamp` string,
`region` string,
`accountid` string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://<BUCKETNAME>/AWSLogs/<FOLDER>'
TBLPROPERTIES (
'storage.location.template'='s3://<BUCKETNAME>/AWSLogs/<FOLDER>/${accountid}/CloudTrail/${region}/${timestamp}',
'projection.enabled'='true',
'projection.timestamp.type'='date',
'projection.timestamp.format'='yyyy/MM/dd',
'projection.timestamp.interval'='1',
'projection.timestamp.interval.unit'='DAYS',
'projection.timestamp.range'='2024/07/01,NOW',
'projection.accountid.type'='enum',
'projection.accountid.values'='533267304144,720070302599,716763681054',
'projection.region.type'='enum',
'projection.region.values'='us-east-1,us-east-2'
)
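Athena query:
-- Assumed SELECT list and FROM clause, inferred from the GROUP BY columns
SELECT useridentity.arn,
eventname,
sourceipaddress,
recipientAccountId,
awsRegion,
userAgent,
count(*) AS count
FROM cloudtrail_logs_all_accounts
GROUP BY useridentity.arn,
eventname,
sourceipaddress,
recipientAccountId,
awsRegion,
userAgent
ORDER BY count DESC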
Do you know how you’re using CloudWatch API calls today? Validating with these visuals can help identify changes in usage patterns that lead to optimizations or efficiency improvements.
"This is gonna tell me 98% of my API calls in my account are from GetMetricData." - Chaitanya Gummadi
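To make that kind of percentage breakdown concrete, here is a small sketch that computes each API's share of calls from a list of CloudTrail records. The function name and the sample records are made up for illustration; real input would come from your query results.

```python
from collections import Counter


def api_call_shares(events):
    """Return {eventName: percentage of all calls}, largest share first."""
    counts = Counter(event["eventName"] for event in events)
    total = sum(counts.values())
    return {
        name: round(100 * count / total, 1)
        for name, count in counts.most_common()
    }


if __name__ == "__main__":
    # Hypothetical sample: 98 GetMetricData calls and 2 ListMetrics calls.
    sample = [{"eventName": "GetMetricData"}] * 98 + [{"eventName": "ListMetrics"}] * 2
    print(api_call_shares(sample))  # {'GetMetricData': 98.0, 'ListMetrics': 2.0}
```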
Aside from the new CloudTrail options, don’t forget that other native tools like AWS Cost Explorer can help you easily identify changing patterns or unintended usage.
"The typical tool and the simplest and most straightforward is Cost Explorer. It's sometimes underestimated, but Cost Explorer is super powerful and you can get a lot of insights." - Benjamin Lecoq
Want to learn more about optimizing your CloudWatch costs? Watch the full episode recording for in-depth demonstrations and expert insights!
Remember, understanding and optimizing your CloudWatch usage can lead to significant cost savings. Start analyzing your usage today! Join us live at https://twitch.tv/aws at 11:00 AM EST every Thursday.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.