operating-kubernetes
Kubernetes Operations
Purpose
Operating Kubernetes clusters in production requires mastery of resource management, scheduling patterns, networking architecture, storage strategies, security hardening, and autoscaling. This skill provides operations-first frameworks for right-sizing workloads, implementing high-availability patterns, securing clusters with RBAC and Pod Security Standards, and systematically troubleshooting common failures.
Use this skill when deploying applications to Kubernetes, configuring cluster resources, implementing NetworkPolicies for zero-trust security, setting up autoscaling (HPA, VPA, KEDA), managing persistent storage, or diagnosing operational issues like CrashLoopBackOff or resource exhaustion.
When to Use This Skill
Common Triggers:
- "Deploy my application to Kubernetes"
- "Configure resource requests and limits"
- "Set up autoscaling for my pods"
- "Implement NetworkPolicies for security"
- "My pod is stuck in Pending/CrashLoopBackOff"
- "Configure RBAC with least privilege"
- "Set up persistent storage for my database"
- "Spread pods across availability zones"
More from ancoleman/ai-design-components
creating-dashboards
Creates comprehensive dashboard and analytics interfaces that combine data visualization, KPI cards, real-time updates, and interactive layouts. Use this skill when building business intelligence dashboards, monitoring systems, executive reports, or any interface that requires multiple coordinated data displays with filters, metrics, and visualizations working together.
245implementing-drag-drop
Implements drag-and-drop and sortable interfaces with React/TypeScript including kanban boards, sortable lists, file uploads, and reorderable grids. Use when building interactive UIs requiring direct manipulation, spatial organization, or touch-friendly reordering.
164administering-linux
Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.
127security-hardening
Reduces attack surface across OS, container, cloud, network, and database layers using CIS Benchmarks and zero-trust principles. Use when hardening production infrastructure, meeting compliance requirements, or implementing defense-in-depth security.
109building-ai-chat
Builds AI chat interfaces and conversational UI with streaming responses, context management, and multi-modal support. Use when creating ChatGPT-style interfaces, AI assistants, code copilots, or conversational agents. Handles streaming text, token limits, regeneration, feedback loops, tool usage visualization, and AI-specific error patterns. Provides battle-tested components from leading AI products with accessibility and performance built in.
74designing-distributed-systems
When designing distributed systems for scalability, reliability, and consistency. Covers CAP/PACELC theorems, consistency models (strong, eventual, causal), replication patterns (leader-follower, multi-leader, leaderless), partitioning strategies (hash, range, geographic), transaction patterns (saga, event sourcing, CQRS), resilience patterns (circuit breaker, bulkhead), service discovery, and caching strategies for building fault-tolerant distributed architectures.
52