
Thursday Jan 08, 2026
HolmesGPT: AI Root Cause Analysis for Kubernetes
A deep dive into HolmesGPT, the CNCF Sandbox AI agent for root cause analysis in cloud-native environments. This episode covers what it is, its 40+ integrations, the project roadmap, and how to set it up today.
News Segment:
- Air France-KLM's secure automation platform with Terraform, Vault, and Ansible
- AWS ECS tmpfs mounts on Fargate for secure secrets handling
- Qwen 30B running on Raspberry Pi - democratizing edge AI
- AWS European Sovereign Cloud with independent EU governance
Main Topic - HolmesGPT:
- CNCF Sandbox project (accepted October 2025) with 1,600+ GitHub stars
- Agentic architecture: creates investigation task lists, queries systems, synthesizes findings
- 40+ built-in toolsets: Prometheus, Grafana Loki/Tempo, Kubernetes, ArgoCD, Datadog, and more
- Privacy-first: bring your own LLM keys, read-only access, respects RBAC
- End-to-end automation with Alertmanager, PagerDuty, and Opsgenie integrations
- Installation options: pip, Homebrew, Helm, Web UI, K9s plugin
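The agentic flow described above (build a task list, query systems, synthesize findings) can be illustrated with a minimal Python sketch. This is not HolmesGPT's code or API: every name here (`plan`, `investigate`, `synthesize`, `TOOLS`, and the fake tool functions) is hypothetical, the planner is a hard-coded stand-in for an LLM call, and the tools return canned strings instead of querying a real cluster.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Finding:
    task: str    # what the agent set out to check
    tool: str    # which read-only tool it used
    output: str  # raw evidence returned by the tool


# Hypothetical read-only tools. A real toolset would query Kubernetes,
# Prometheus, Loki, and so on; these return canned text for illustration.
def fake_pod_status(target: str) -> str:
    return f"{target}: CrashLoopBackOff, 7 restarts"


def fake_pod_logs(target: str) -> str:
    return f"{target}: OOMKilled (memory limit 256Mi)"


TOOLS: Dict[str, Callable[[str], str]] = {
    "pod_status": fake_pod_status,
    "pod_logs": fake_pod_logs,
}


def plan(alert: str) -> List[Tuple[str, str]]:
    """Stand-in for the LLM planning step: returns (task, tool name) pairs."""
    return [
        (f"Check pod state for alert '{alert}'", "pod_status"),
        (f"Read recent logs for alert '{alert}'", "pod_logs"),
    ]


def investigate(alert: str, target: str) -> List[Finding]:
    """Run each planned task against its tool and collect the evidence."""
    findings: List[Finding] = []
    for task, tool_name in plan(alert):
        tool = TOOLS.get(tool_name)
        if tool is None:
            continue  # skip tasks whose tool is not available
        findings.append(Finding(task, tool_name, tool(target)))
    return findings


def synthesize(findings: List[Finding]) -> str:
    """Stand-in for the LLM summary step: lists the collected evidence."""
    lines = [f"- [{f.tool}] {f.output}" for f in findings]
    return "Evidence collected:\n" + "\n".join(lines)


if __name__ == "__main__":
    # Alert and workload names are made up for the example.
    print(synthesize(investigate("KubePodCrashLooping", "payments-api")))
```

In this sketch the tools only read state and never change it, which mirrors the read-only access model mentioned in the list above. In a real agent, the planning and synthesis steps would be model calls rather than fixed functions.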
Episode Type: full
Episode Number: 83
Season: 1
Tags: HolmesGPT, CNCF, Kubernetes, root cause analysis, AI ops, troubleshooting, observability, SRE, platform engineering, Robusta, agentic AI