Easily Monitor Containerized Applications with Amazon CloudWatch Container Insights
How to collect, aggregate, and analyze metrics from your containerized applications using Amazon CloudWatch Container Insights.
Step 1: Set up Container Insights on Amazon EKS
Step 2: Deploy a Container Application in the Cluster
Step 3: Use CloudWatch Logs Insights Query to search and analyze container logs
To Run a CloudWatch Logs Insights Sample Query:
Step 4: Monitor Performance of the Application with Container Insights
View Container Insights Dashboard Metrics
View Additional Amazon EKS and Kubernetes Container Insights Metrics
Note: Amazon CloudWatch Container Insights is not covered by the AWS Free Tier, so following this tutorial may incur charges even if you are within your first 12 months.
About | |
---|---|
✅ AWS experience | 200 - Intermediate |
⏱ Time to complete | 30 minutes |
🧩 Prerequisites | - AWS Account |
📢 Feedback | Any feedback, issues, or just a 👍 / 👎 ? |
⏰ Last Updated | 2023-10-02 |
- Install the latest version of kubectl. To check your version, run:
  kubectl version --short
- Install the latest version of eksctl. To check your version, run:
  eksctl info
Step 1: Set up Container Insights on Amazon EKS
Set the ClusterName and LogRegion environment variables. In the following example, managednodes-quickstart is the name of your Amazon EKS cluster, and us-east-2 is the Region where the logs are published. Replace these values with your own. Specify the Region where your cluster is located to minimize AWS outbound data transfer costs. FluentBitHttpPort is set to '2020' because this port is commonly used for monitoring and allows integration with existing tools. FluentBitReadFromHead is set to 'Off' so that logs are read from the end rather than the beginning, which helps when log files are large.
export ClusterName=managednodes-quickstart
export LogRegion=us-east-2
export FluentBitHttpPort='2020'
export FluentBitReadFromHead='Off'
Set FluentBitReadFromTail to the opposite of FluentBitReadFromHead:
[[ ${FluentBitReadFromHead} = 'On' ]] && FluentBitReadFromTail='Off'|| FluentBitReadFromTail='On'
FluentBitHttpServer for monitoring plugin metrics is on by default. The following command turns it off only when FluentBitHttpPort is empty:
[[ -z ${FluentBitHttpPort} ]] && FluentBitHttpServer='Off' || FluentBitHttpServer='On'
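The two one-liners above pack an if/else into the shell's && and || operators. As a rough sketch of the same logic written out longhand, the block below uses two helper functions. The names derive_read_from_tail and derive_http_server are hypothetical and not part of the quickstart.

```shell
# Hypothetical helpers; they mirror the one-liners above but are not part of the quickstart.

# Fluent Bit reads from the tail unless reading from the head is explicitly 'On'.
derive_read_from_tail() {
  if [ "$1" = 'On' ]; then
    echo 'Off'
  else
    echo 'On'
  fi
}

# The HTTP server is enabled whenever a port value is set.
derive_http_server() {
  if [ -z "$1" ]; then
    echo 'Off'
  else
    echo 'On'
  fi
}

# Fall back to 'Off' / empty when the variables have not been exported yet.
FluentBitReadFromTail=$(derive_read_from_tail "${FluentBitReadFromHead:-Off}")
FluentBitHttpServer=$(derive_http_server "${FluentBitHttpPort:-}")
echo "ReadFromTail=${FluentBitReadFromTail} HttpServer=${FluentBitHttpServer}"
```

Printing the derived values before rendering the manifest makes it easier to spot a mistyped 'On' or an unset port.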
Download the quickstart manifest and substitute the values set above:
curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluent-bit-quickstart.yaml | sed 's/{{cluster_name}}/'${ClusterName}'/;s/{{region_name}}/'${LogRegion}'/;s/{{http_server_toggle}}/"'${FluentBitHttpServer}'"/;s/{{http_server_port}}/"'${FluentBitHttpPort}'"/;s/{{read_from_head}}/"'${FluentBitReadFromHead}'"/;s/{{read_from_tail}}/"'${FluentBitReadFromTail}'"/' > cwagent-fluent-bit-quickstart.yaml
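Before applying the rendered manifest, it can help to confirm that every placeholder was substituted. The sketch below runs the same kind of sed substitution on a tiny inline template and then checks for leftover {{ markers. The two-line template and the function names render_template and no_placeholders_left are made up for illustration; the real manifest has more placeholders.

```shell
# Defaults match the example values used in this tutorial.
ClusterName=${ClusterName:-managednodes-quickstart}
LogRegion=${LogRegion:-us-east-2}

# Substitute two of the quickstart placeholders in the text read from stdin.
render_template() {
  sed "s/{{cluster_name}}/${ClusterName}/;s/{{region_name}}/${LogRegion}/"
}

# Succeed only when the file given as $1 contains no '{{' marker.
no_placeholders_left() {
  ! grep -q '{{' "$1"
}

# Illustrative template, not the real manifest.
sample=$(mktemp)
printf 'cluster: {{cluster_name}}\nregion: {{region_name}}\n' | render_template > "$sample"
cat "$sample"

if no_placeholders_left "$sample"; then
  echo "all placeholders substituted"
fi
```

Running no_placeholders_left cwagent-fluent-bit-quickstart.yaml would serve the same purpose for the real file, assuming the manifest contains no other {{ sequences.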
Create an IAM service account for Fluent Bit with the CloudWatchAgentServerPolicy managed policy attached:
eksctl create iamserviceaccount --name fluent-bit \
--namespace amazon-cloudwatch \
--cluster ${ClusterName} --role-name fluent-bit \
--attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
--approve --region ${LogRegion} \
--override-existing-serviceaccounts
Deploy the CloudWatch agent and Fluent Bit to the cluster:
kubectl apply -f cwagent-fluent-bit-quickstart.yaml
Verify that the Pods in the amazon-cloudwatch namespace are running:
kubectl get pods -n amazon-cloudwatch
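The command above lists the CloudWatch agent and Fluent Bit Pods. As a small sketch for spotting problems quickly, the hypothetical helper pods_not_running below reads kubectl get pods output on stdin and prints the names of Pods whose STATUS column is not Running. The sample Pod names and statuses are invented.

```shell
# Hypothetical helper: print the names of Pods whose STATUS (3rd column) is not Running.
pods_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Invented sample in the shape of `kubectl get pods` output.
sample_output='NAME                     READY   STATUS             RESTARTS   AGE
cloudwatch-agent-abcde   1/1     Running            0          30s
fluent-bit-fghij         0/1     CrashLoopBackOff   3          30s'

# Prints only the Pod that is not Running.
printf '%s\n' "$sample_output" | pods_not_running
```

Against a live cluster, the equivalent would be kubectl get pods -n amazon-cloudwatch | pods_not_running; empty output means every Pod reports Running.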
Step 2: Deploy a Container Application in the Cluster
- Create a Kubernetes manifest called workload.yaml and paste the following contents into it.
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: quickstart
  name: quickstart
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "quickstart-nginx-deployment"
  namespace: "quickstart"
spec:
  selector:
    matchLabels:
      app: "quickstart-nginx"
  replicas: 3
  template:
    metadata:
      labels:
        app: "quickstart-nginx"
        role: "backend"
    spec:
      dnsPolicy: Default
      enableServiceLinks: false
      automountServiceAccountToken: false
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - image: public.ecr.aws/nginx/nginx:latest
          imagePullPolicy: Always
          name: "quickstart-nginx"
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          ports:
            - containerPort: 80
          command: ["/bin/sh"]
          args: ["-c", "echo PodName: $MY_POD_NAME NodeName: $MY_NODE_NAME podIP: $MY_POD_IP> /usr/share/nginx/html/index.html && exec nginx -g 'daemon off;'"]
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - name: cache
              mountPath: /var/cache/nginx
            - name: usr
              mountPath: /var/run
            - name: tmp
              mountPath: /usr/share/nginx/html
      volumes:
        - name: cache
          emptyDir: {}
        - name: tmp
          emptyDir: {}
        - name: usr
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: quickstart-nginx-service
  namespace: quickstart
spec:
  type: NodePort
  selector:
    app: "quickstart-nginx"
    role: "backend"
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: load
  namespace: quickstart
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
  automountServiceAccountToken: false
  containers:
    - name: load
      image: public.ecr.aws/docker/library/busybox:1.36.1
      imagePullPolicy: Always
      command: ["/bin/sh"]
      args: ["-c", "while sleep 0.5; do wget -q -O- http://quickstart-nginx-service; done"]
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "128Mi"
          cpu: "500m"
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
- Deploy the Kubernetes resources in workload.yaml:
kubectl apply -f workload.yaml
The output should look similar to the following:
namespace/quickstart created
deployment.apps/quickstart-nginx-deployment created
service/quickstart-nginx-service created
pod/load created
- Use the following command to check the status of the deployed Nginx containers and ensure that they are running:
kubectl get all -n quickstart
NAME                                               READY   STATUS    RESTARTS   AGE
pod/load                                           1/1     Running   0          15s
pod/quickstart-nginx-deployment-7cd757dc7b-9fss6   1/1     Running   0          16s
pod/quickstart-nginx-deployment-7cd757dc7b-fv592   1/1     Running   0          16s
pod/quickstart-nginx-deployment-7cd757dc7b-wpw4x   1/1     Running   0          16s

NAME                               TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/quickstart-nginx-service   NodePort   10.100.233.21   <none>        80:31243/TCP   16s

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/quickstart-nginx-deployment   3/3     3            3           17s

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/quickstart-nginx-deployment-7cd757dc7b   3         3         3       17s
- Use the following command to view the real-time logs of the "load" Pod, which is continuously making requests to the Nginx service. Use Ctrl+C to stop.
kubectl logs -f load -n quickstart
PodName: quickstart-nginx-deployment-7cd757dc7b-wpw4x NodeName: ip-192-168-141-57.us-east-2.compute.internal podIP: 192.168.136.230
PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-9fss6 NodeName: ip-192-168-119-7.us-east-2.compute.internal podIP: 192.168.112.25
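Each log line names the backend Pod that answered the request, so the same output can show how the Service spreads traffic across the three replicas. The sketch below counts responses per Pod. The helper name count_by_pod is hypothetical, and the sample input is the four log lines shown above.

```shell
# Hypothetical helper: count responses per backend Pod from the load Pod's log lines.
count_by_pod() {
  awk '$1 == "PodName:" { n[$2]++ } END { for (p in n) print n[p], p }' | sort -k2
}

# The four sample log lines from the output above.
sample_log='PodName: quickstart-nginx-deployment-7cd757dc7b-wpw4x NodeName: ip-192-168-141-57.us-east-2.compute.internal podIP: 192.168.136.230
PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-fv592 NodeName: ip-192-168-177-109.us-east-2.compute.internal podIP: 192.168.164.31
PodName: quickstart-nginx-deployment-7cd757dc7b-9fss6 NodeName: ip-192-168-119-7.us-east-2.compute.internal podIP: 192.168.112.25'

# One line per Pod: request count followed by the Pod name.
printf '%s\n' "$sample_log" | count_by_pod
```

On a live cluster, something like kubectl logs load -n quickstart --tail=200 | count_by_pod would give the distribution over the most recent 200 requests.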
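Step 3: Use CloudWatch Logs Insights Query to search and analyze container logs
Container Insights sends application logs to the log group /aws/containerinsights/Cluster_Name/application, which contains all log files in /var/log/containers on each worker node in the cluster.
To Run a CloudWatch Logs Insights Sample Query:
- Open the CloudWatch console.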
- In the navigation pane, choose Logs, and then choose Log groups.
- Click the log group /aws/containerinsights/CLUSTER_NAME/application, where CLUSTER_NAME is the actual name of your EKS cluster.
- Under the log details (top-right), click View in Logs Insights.
- Delete the default query in the CloudWatch Logs Insights query editor. Then enter the following query and select Run query:
fields @timestamp, kubernetes.pod_name as PodName, kubernetes.host as WorkerNode, kubernetes.namespace_name as Namespace, log
| filter PodName like 'quickstart-nginx-deployment'
| sort @timestamp desc
| limit 200
- Use the time interval selector to choose the time period that you want to query.
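The console steps above can also be approximated from a terminal. The sketch below wraps aws logs start-query in a function that queries the last hour of the application log group. The function names application_log_group and start_nginx_log_query are hypothetical, and the wrapper assumes the AWS CLI is installed and configured with permission to query CloudWatch Logs.

```shell
# Build the application log group name for a cluster.
application_log_group() {
  printf '/aws/containerinsights/%s/application' "$1"
}

# Hypothetical wrapper: start the sample query over the last hour and print the query ID.
# $1 = cluster name, $2 = Region. Assumes a configured AWS CLI.
start_nginx_log_query() {
  end_time=$(date +%s)              # now, in epoch seconds
  start_time=$((end_time - 3600))   # one hour earlier
  aws logs start-query \
    --region "$2" \
    --log-group-name "$(application_log_group "$1")" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --query-string "fields @timestamp, kubernetes.pod_name as PodName, kubernetes.host as WorkerNode, kubernetes.namespace_name as Namespace, log | filter PodName like 'quickstart-nginx-deployment' | sort @timestamp desc | limit 200" \
    --query 'queryId' --output text
}

# Show the log group the wrapper would query for the tutorial's example cluster.
echo "$(application_log_group managednodes-quickstart)"
```

The query runs asynchronously, so the printed ID would then be passed to aws logs get-query-results --query-id to fetch the matching log events once the query completes.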
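Step 4: Monitor Performance of the Application with Container Insights
View Container Insights Dashboard Metrics
- Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.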
- In the left navigation pane, open the Insights dropdown menu, and then choose Container Insights.
- Under “Container Insights” (top), select Performance Monitoring from the dropdown menu.
- In the “EKS Clusters” dropdown field, select the name of your cluster.
- Use the additional dropdown menus to filter resources, such as “EKS Clusters” and “EKS Pods.”
View Additional Amazon EKS and Kubernetes Container Insights Metrics
- Create a Kubernetes manifest called geo-api.yaml with the content below. It defines a simple backend application called geo-api:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: geo-api
spec:
  selector:
    matchLabels:
      run: geo-api
  replicas: 1
  template:
    metadata:
      labels:
        run: geo-api
    spec:
      containers:
        - name: geo-api
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 250m
              memory: "12Mi"
            requests:
              cpu: 125m
              memory: "10Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: geo-api
  labels:
    run: geo-api
spec:
  ports:
    - port: 80
  selector:
    run: geo-api
- Deploy the application using the command below:
kubectl apply -f geo-api.yaml
- Create a load for the web server by running a container.
kubectl create deployment geo-api-load \
--image=busybox \
--replicas=2 -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://geo-api; done"
- Verify the Pods status:
kubectl get pods
NAME                           READY   STATUS              RESTARTS         AGE
geo-api-load-c9c7bf98c-4rrn8   0/1     ContainerCreating   0                0s
geo-api-load-c9c7bf98c-kzsjs   0/1     ContainerCreating   0                0s
geo-api-load-c9c7bf98c-4rrn8   1/1     Running             0                1s
geo-api-load-c9c7bf98c-kzsjs   1/1     Running             0                2s
geo-api-76f6dcf999-ptpz5       1/1     Running             20 (5m8s ago)    118m
geo-api-76f6dcf999-ptpz5       0/1     OOMKilled           20 (5m13s ago)   118m
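- The geo-api Pod ends up in the OOMKilled state. To inspect the container's last state, run the following command, replacing the Pod name with your own: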
kubectl get pod geo-api-76f6dcf999-ptpz5 --output=yaml | grep -i lastState -A7
    lastState:
      terminated:
        containerID: containerd://4bbdfee06a3d3daca0e74f14f18f8a66ac0a415c79720eae44ea9ad4c46bcb37
        exitCode: 137
        finishedAt: "2023-08-26T12:48:37Z"
        reason: OOMKilled
        startedAt: "2023-08-26T12:47:27Z"
    name: geo-api
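The exitCode of 137 in the output above follows the common convention that a process ended by a signal reports 128 plus the signal number: 137 - 128 = 9, which is SIGKILL, the signal the kernel sends when a container exceeds its memory limit. As a minimal sketch of that arithmetic (the helper name signal_for_exit_code is hypothetical):

```shell
# Hypothetical helper: map a container exit code above 128 to the signal that ended it.
signal_for_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    # kill -l N prints the name of signal number N.
    kill -l "$((code - 128))"
  else
    echo "none"
  fi
}

signal_for_exit_code 137   # prints KILL
```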
- Let’s view the Container Insights metrics of this pod:
- Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
- In the navigation pane, choose Metrics, and then choose All metrics.
- Select the ContainerInsights metric namespace, and then select the ClusterName, Namespace, PodName dimension set. In the search bar, enter PodName="geo-api".
- You can view the percentage of CPU units being used by the pod relative to the pod limit and the percentage of memory that is being used by pods relative to the pod limit by selecting the metrics below:
pod_cpu_utilization_over_pod_limit
pod_memory_utilization_over_pod_limit
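The same metrics can be read from a terminal instead of the console. The sketch below wraps aws cloudwatch get-metric-statistics for the memory metric of the geo-api Pod. The function names pod_dimensions and pod_memory_over_limit are hypothetical, the default namespace is assumed because geo-api.yaml sets none, and the wrapper assumes a configured AWS CLI.

```shell
# Build the --dimensions arguments for a pod-level Container Insights metric.
# $1 = cluster name, $2 = Kubernetes namespace, $3 = PodName dimension value.
pod_dimensions() {
  printf 'Name=ClusterName,Value=%s Name=Namespace,Value=%s Name=PodName,Value=%s' "$1" "$2" "$3"
}

# Hypothetical wrapper; assumes a configured AWS CLI.
# $1 = cluster, $2 = Region, $3 = start time, $4 = end time (ISO 8601, e.g. 2023-08-26T12:00:00Z).
pod_memory_over_limit() {
  # The unquoted $(pod_dimensions ...) is intentional: each Name=...,Value=... pair
  # must reach the CLI as a separate argument.
  aws cloudwatch get-metric-statistics \
    --region "$2" \
    --namespace ContainerInsights \
    --metric-name pod_memory_utilization_over_pod_limit \
    --dimensions $(pod_dimensions "$1" default geo-api) \
    --start-time "$3" --end-time "$4" \
    --period 60 --statistics Maximum
}

# Show the dimensions the wrapper would send for the tutorial's example cluster.
echo "$(pod_dimensions managednodes-quickstart default geo-api)"
```

Swapping the metric name for pod_cpu_utilization_over_pod_limit would return the CPU counterpart with the same dimensions.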
Clean up the resources created in this tutorial:
# Delete workloads
kubectl delete -f workload.yaml -n quickstart
kubectl delete -f geo-api.yaml
kubectl delete deployment geo-api-load

# Delete the CloudWatch agent and Fluent Bit for Container Insights
kubectl delete -f cwagent-fluent-bit-quickstart.yaml
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.