
KAAR AI-Powered Kubernetes Cluster Analysis and Remediation
KAAR (Kubernetes AI-powered Analysis and Remediation), a tool that automates Kubernetes Pod issue detection and resolution using k8sgpt and AWS Bedrock.
Published May 22, 2025
KAAR (Kubernetes AI-powered Analysis and Remediation), a tool that automates Kubernetes Pod issue detection and resolution using k8sgpt and AWS Bedrock.
In this post, I’ll introduce KAAR, explain its value, and guide you on getting started with it.
Pods are the core of Kubernetes, but they often encounter issues that halt your applications:
- ImagePullBackOff: Pods fail to start due to invalid or inaccessible container images.
- CrashLoopBackOff: Containers crash repeatedly from misconfigured commands or errors like “executable file not found.”
- OOMKilled: Pods are terminated for exceeding memory limits.
- Pending: Pods can’t schedule due to resource constraints or node affinity issues.
Manually debugging these with
kubectl describe
and logs
is time-consuming, especially in large clusters.KAAR automates this process, leveraging AI to resolve Pod issues quickly and reliably, saving valuable time for DevOps teams.
KAAR is an open-source tool designed to simplify Kubernetes Pod management. It integrates:
- k8sgpt: A diagnostic tool that scans Kubernetes clusters for Pod issues.
- AWS Bedrock: Uses the Claude v2 model to classify issues and recommend fixes.
- AWS SNS and CloudWatch: Sends notifications and logs results for team visibility.
- kubectl: Applies automated fixes to get Pods back on track.
KAAR is perfect for DevOps engineers, SREs, and architects who want to minimize manual intervention. It currently focuses on Pod remediation, with plans to support Services and Deployments in future releases.
KAAR follows a streamlined, AI-powered workflow to fix Pod issues:
- Cluster Scanning: KAAR uses k8sgpt to analyze your cluster and identify Pod issues.
- Issue Classification: Findings are processed by AWS Bedrock to determine issue type.
- Remediation: Applies fixes via
kubectl
, like updating images or adjusting limits. - Verification: Ensures Pods are Running and healthy.
- Notification: Logs results to CloudWatch and sends SNS notifications.
A Pod
nginx-pod
is stuck due to an invalid command. KAAR:- Detects the issue with k8sgpt.
- Uses Bedrock to classify it as CrashLoopBackOff.
- Updates the Pod’s command.
- Verifies it is Running.
- Sends an SNS alert:> “Pod nginx-pod in default is Healthy.”
A Pod
memory-hog
is terminated for low memory. KAAR:- Identifies the OOMKilled issue.
- Increases the memory limit.
- Confirms stability.
- Sends an alert:> “Pod memory-hog in default is Healthy.”
Python 3.11 with boto3 and pyyaml:
KAAR offers compelling benefits for Kubernetes admins:
- Time Savings: Automates remediation.
- Accuracy: AI-driven classification and suggestions.
- AWS Integration: Works with Bedrock, SNS, CloudWatch.
- Future-Ready: Coming support for Services and Deployments.
As an AWS Community Builder, I built KAAR to reflect AWS’s commitment to automation and innovation.
Ensure you have:
Kubernetes Cluster: AWS EKS, minikube, or kind.
AWS Credentials:
Python 3.11 with boto3 and pyyaml:
Install KAAR via pip:
Create my_config.yaml:
KAAR is poised to become a comprehensive Kubernetes management tool. Planned enhancements include:
- Service Remediation: Support for issues like SelectorMismatch and LoadBalancerPending.
- Deployment Support: Fixes for Deployment misconfigurations.
- EventBridge Integration: Scheduled runs for proactive monitoring, inspired by my KAAR project.
- Local LLMs: Support for Ollama as an alternative to Bedrock.