chaos-engineer

Summary

Designs and executes chaos experiments with safety controls, runbooks, and resilience testing frameworks.

  • Covers the full chaos workflow: system analysis, hypothesis-driven experiment design, controlled failure injection, and learning loops that end in documented improvements
  • Provides templates and reference guides for infrastructure chaos (servers, networks, zones), Kubernetes-native experiments (Litmus, Chaos Mesh), and game day exercises
  • Includes concrete examples using Litmus ChaosEngine, toxiproxy for network injection, and Chaos Monkey with blast radius controls and automated rollback procedures
  • Enforces safety constraints: steady-state verification, sub-30-second rollback paths, single-variable isolation, and production safeguards via circuit breakers or canaries
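
The safety constraints in the last bullet can be sketched as a small guard around a single injection. This is a minimal illustration, not the skill's own implementation: `run_guarded_experiment` and its callback parameters are hypothetical names, and the caller is assumed to supply the steady-state check, the injection, and the rollback. Only the 30-second rollback budget comes from the summary above.

```python
import time
from typing import Callable


def run_guarded_experiment(
    steady_state_ok: Callable[[], bool],
    inject: Callable[[], None],
    rollback: Callable[[], None],
    observe_seconds: float = 5.0,
    poll_seconds: float = 1.0,
    rollback_budget_seconds: float = 30.0,  # "sub-30-second rollback" from the summary
) -> dict:
    """Inject one failure, watch steady state, and always roll back."""
    # Steady-state verification: refuse to start on an already unhealthy system.
    if not steady_state_ok():
        return {"status": "skipped", "reason": "steady state not met before injection"}

    held = True
    try:
        inject()
        deadline = time.monotonic() + observe_seconds
        while time.monotonic() < deadline:
            if not steady_state_ok():
                held = False  # hypothesis falsified, stop observing early
                break
            time.sleep(poll_seconds)
    finally:
        # Rollback runs even if injection or monitoring raised an exception.
        started = time.monotonic()
        rollback()
        rollback_seconds = time.monotonic() - started

    return {
        "status": "hypothesis_held" if held else "hypothesis_failed",
        "rollback_seconds": rollback_seconds,
        "rollback_within_budget": rollback_seconds < rollback_budget_seconds,
    }
```

In practice `inject` and `rollback` would wrap a real tool such as a Litmus ChaosEngine or a toxiproxy toxic. The guard only fixes the order of operations and measures the rollback time against the budget.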
SKILL.md

Chaos Engineer

When to Use This Skill

  • Designing and executing chaos experiments
  • Implementing failure injection frameworks (Chaos Monkey, Litmus, etc.)
  • Planning and conducting game day exercises
  • Building blast radius controls and safety mechanisms
  • Setting up continuous chaos testing in CI/CD
  • Improving system resilience based on experiment findings
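
As one way to picture the blast radius controls mentioned in the list above, the sketch below caps how many instances a single experiment may touch. The function name `pick_targets`, the 10% default, and the `min_survivors` floor are assumptions made for illustration. They are not values taken from the skill.

```python
import math
import random
from typing import Optional, Sequence


def pick_targets(
    instances: Sequence[str],
    max_fraction: float = 0.1,   # assumed cap on the share of instances disrupted
    min_survivors: int = 1,      # assumed floor of instances left untouched
    seed: Optional[int] = None,
) -> list:
    """Choose a bounded random subset of instances to disrupt."""
    if not 0 < max_fraction <= 1:
        raise ValueError("max_fraction must be in (0, 1]")
    # Round down so the cap is never exceeded.
    limit = math.floor(len(instances) * max_fraction)
    # Never take so many that fewer than min_survivors remain.
    limit = max(0, min(limit, len(instances) - min_survivors))
    rng = random.Random(seed)
    return rng.sample(list(instances), limit)
```

With a single instance the function returns an empty list, because disrupting it would leave no survivor. That is the conservative behaviour a blast radius control should default to.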

Core Workflow

  1. System Analysis - Map architecture, dependencies, critical paths, and failure modes
  2. Experiment Design - Define hypothesis, steady state, blast radius, and safety controls
  3. Execute Chaos - Run controlled experiments with live monitoring and a rollback path that completes in under 30 seconds
  4. Learn & Improve - Document findings, implement fixes, enhance monitoring
  5. Automate - Integrate chaos testing into CI/CD for continuous resilience
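
Step 2 of the workflow above can be made concrete by writing the experiment design down as data and reviewing it before anything runs. The following is a hedged sketch: `ExperimentPlan`, `review`, and the 10% blast radius cap are hypothetical and exist only for illustration. The single-variable rule and the 30-second rollback limit follow the safety constraints listed in the summary.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ExperimentPlan:
    hypothesis: str               # what we expect to stay true under failure
    steady_state: str             # measurable signal, e.g. "p99 latency < 300 ms"
    failure_variables: list[str]  # failures to inject in this experiment
    blast_radius_percent: float   # share of traffic or instances affected
    rollback_seconds: float       # measured or estimated time to undo the injection


def review(plan: ExperimentPlan, max_blast_radius_percent: float = 10.0) -> list[str]:
    """Return the reasons a plan is not ready to run (empty list means ready)."""
    problems = []
    if not plan.hypothesis.strip():
        problems.append("missing hypothesis")
    if not plan.steady_state.strip():
        problems.append("missing steady-state definition")
    if len(plan.failure_variables) != 1:
        problems.append("inject exactly one variable per experiment")
    if plan.blast_radius_percent > max_blast_radius_percent:
        problems.append("blast radius exceeds the allowed cap")
    if plan.rollback_seconds >= 30:
        problems.append("rollback path must complete in under 30 seconds")
    return problems
```

A check like this could run in CI/CD ahead of step 5, so that a plan missing a hypothesis or a fast rollback path is rejected before any failure is injected.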

More from jeffallan/claude-skills

Installs: 1.7K
GitHub Stars: 9.0K
First Seen: Jan 20, 2026