Wednesday Dec 10, 2025

AWS re:Invent 2025 Recap Part 3/4 - EKS & Cloud Operations

Part 3 of our AWS re:Invent 2025 series. AWS transforms Kubernetes into an AI infrastructure platform with massive scale and AI-native operations.

In this episode:
- EKS Ultra Scale: 100,000 nodes per cluster (vs 15K GKE, 5K AKS)—1.6 million Trainium accelerators or 800K GPUs in a single cluster
- AWS replaced etcd's Raft consensus with its internal "journal" system and moved etcd to in-memory storage, reaching 500 pods/sec at 100K-node scale
- Anthropic uses EKS Ultra Scale for Claude training, improving latency KPIs from 35% to 90%+
- EKS Capabilities: Fully managed Argo CD, AWS Controllers for Kubernetes (200+ CRDs for 50+ services), Kube Resource Orchestrator
- EKS MCP Server: Natural language Kubernetes management—"show me all pods not running" instead of kubectl
- EKS Provisioned Control Plane: XL/2XL/4XL tiers ($1.65-$6.90/hr), 4XL supports 40K nodes
- CloudWatch Gen AI Observability: LangChain, LangGraph, CrewAI agent tracing
- DevOps Agent (Preview): Autonomous on-call engineer—Kindle saw 80% time savings
- CloudWatch unified data store with S3 Tables, OCSF, Apache Iceberg
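
To put the headline numbers above in context, here is a minimal back-of-the-envelope sketch. It uses only the figures quoted in the list (100,000 nodes, 1.6 million Trainium accelerators, 800K GPUs, and the $1.65-$6.90/hr price range). The 730-hour month and the helper names `per_node` and `monthly_cost` are assumptions made for illustration; they are not AWS figures or APIs.

```python
# Figures quoted in the episode notes.
NODES_PER_CLUSTER = 100_000
TRAINIUM_PER_CLUSTER = 1_600_000
GPUS_PER_CLUSTER = 800_000

# Low and high ends of the hourly price range quoted for the
# Provisioned Control Plane tiers.
PRICE_LOW_PER_HOUR = 1.65
PRICE_HIGH_PER_HOUR = 6.90

# Assumption: 730 hours approximates an average month (8,760 hours / 12).
HOURS_PER_MONTH = 730


def per_node(total_accelerators, nodes=NODES_PER_CLUSTER):
    """Accelerators per node implied by a cluster-wide total."""
    return total_accelerators // nodes


def monthly_cost(hourly_price, hours=HOURS_PER_MONTH):
    """Approximate monthly cost for a control plane billed hourly."""
    return round(hourly_price * hours, 2)


trainium_per_node = per_node(TRAINIUM_PER_CLUSTER)
gpus_per_node = per_node(GPUS_PER_CLUSTER)

print(f"Trainium accelerators per node: {trainium_per_node}")
print(f"GPUs per node: {gpus_per_node}")
print(f"Low-end tier, per month: ${monthly_cost(PRICE_LOW_PER_HOUR):,.2f}")
print(f"High-end tier, per month: ${monthly_cost(PRICE_HIGH_PER_HOUR):,.2f}")
```

The per-node figures simply divide the quoted cluster totals by the quoted node count, so they describe the densities implied by the announcement rather than any particular instance type.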

📰 News Segment Links:
• cert-manager v1.19.2 CVE Patches (CVE-2025-61727, CVE-2025-61729)
  https://github.com/cert-manager/cert-manager/releases/tag/v1.19.2
• cert-manager v1.18.4 Backport
  https://github.com/cert-manager/cert-manager/releases/tag/v1.18.4
• Canonical Extends Kubernetes Long-Term Support to 15 Years
  https://thenewstack.io/canonical-extends-kubernetes-long-term-support-to-15-years/
• OpenTofu 1.11 with Ephemeral Resources
  https://github.com/opentofu/opentofu/releases/tag/v1.11.0
• Cloudflare Shift-Left Enterprise IaC
  https://blog.cloudflare.com/shift-left-enterprise-scale/

🔗 Main Content Sources:
• EKS Ultra Scale 100K Nodes
  https://aws.amazon.com/blogs/containers/amazon-eks-enables-ultra-scale-ai-ml-workloads-with-support-for-100k-nodes-per-cluster/
• Under the Hood: EKS Ultra Scale
  https://aws.amazon.com/blogs/containers/under-the-hood-amazon-eks-ultra-scale-clusters/
• EKS Capabilities Announcement
  https://aws.amazon.com/blogs/aws/announcing-amazon-eks-capabilities-for-workload-orchestration-and-cloud-resource-management/
• EKS MCP Server
  https://aws.amazon.com/blogs/containers/introducing-the-fully-managed-amazon-eks-mcp-server-preview/
• EKS Provisioned Control Plane
  https://aws.amazon.com/blogs/containers/amazon-eks-introduces-provisioned-control-plane/
• Cloud Operations Top 10 Announcements
  https://aws.amazon.com/blogs/mt/2025-top-10-announcements-for-aws-cloud-operations/
• AI-driven Operations at re:Invent
  https://aws.amazon.com/blogs/mt/embracing-ai-driven-operations-and-observability-at-reinvent-2025/

Perfect for platform engineers, SREs, DevOps engineers, and cloud architects looking to level up their platform engineering skills.

Series: AWS re:Invent 2025 (Part 3 of 4)

Episode URL: https://platformengineeringplaybook.com/podcasts/00051-aws-reinvent-2025-eks-cloud-operations

Part 1: The Agentic AI Revolution - https://platformengineeringplaybook.com/podcasts/00049-aws-reinvent-2025-agentic-ai-revolution
Part 2: Infrastructure & Developer Experience - https://platformengineeringplaybook.com/podcasts/00050-aws-reinvent-2025-infrastructure-developer-experience

Category: Technology
Subcategory: Software How-To

Keywords: AWS, re:Invent 2025, EKS, Kubernetes, EKS Ultra Scale, EKS Capabilities, Argo CD, ACK, MCP Server, CloudWatch, DevOps Agent, AIOps, platform engineering
