skills/smithery.ai/running-chaos-tests

Chaos Engineering Toolkit

Overview

Run controlled chaos engineering experiments to test a system's resilience, fault tolerance, and recovery capabilities. Each experiment injects a failure such as network latency, a service crash, resource exhaustion, or a dependency outage, then verifies that the system degrades gracefully and recovers automatically.
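As a rough illustration of what "inject a failure and verify graceful degradation" means, the sketch below simulates a dependency outage in plain Python. Everything in it is hypothetical and chosen for the example: the `DependencyOutage` exception, the `inject_outage` wrapper, and the `fetch_price` service call are not part of any chaos tool named in this skill. Tools such as Toxiproxy or Chaos Mesh inject faults at the network or infrastructure level rather than in application code.

```python
import random
from typing import Any, Callable, Dict


class DependencyOutage(Exception):
    """Hypothetical error raised when the injected fault fires."""


def inject_outage(call: Callable[..., Any], failure_rate: float,
                  rng: random.Random) -> Callable[..., Any]:
    """Wrap `call` so that it fails with probability `failure_rate`."""
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        # rng.random() returns a float in [0.0, 1.0).
        if rng.random() < failure_rate:
            raise DependencyOutage("injected fault")
        return call(*args, **kwargs)
    return wrapped


def fetch_price(item: str) -> Dict[str, Any]:
    """Stand-in for a call to a downstream pricing service (assumed)."""
    return {"item": item, "price": 9.99, "source": "live"}


def fetch_price_with_fallback(fetch: Callable[[str], Dict[str, Any]],
                              item: str) -> Dict[str, Any]:
    """Degrade gracefully: return a placeholder instead of crashing."""
    try:
        return fetch(item)
    except DependencyOutage:
        return {"item": item, "price": None, "source": "fallback"}


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    broken = inject_outage(fetch_price, failure_rate=1.0, rng=rng)
    print(fetch_price_with_fallback(broken, "widget"))
```

With `failure_rate=1.0` every call fails, so the caller always takes the fallback path. Lowering the rate approximates an intermittent outage.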

Prerequisites

  • Distributed system or microservice architecture deployed in a staging/test environment
  • Monitoring and alerting configured (Grafana, Datadog, CloudWatch, or Prometheus)
  • Rollback capability for the target environment (manual or automated)
  • Chaos engineering tool installed (Toxiproxy, Pumba, Litmus, or Chaos Mesh)
  • Explicit approval from the team to run chaos experiments
  • Steady-state hypothesis defined (what "healthy" looks like in metrics)
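One way to make the steady-state hypothesis concrete is to write it down as metric thresholds and check a metrics snapshot against them before and after an experiment. The sketch below is a minimal example under stated assumptions: the metric names (`error_rate`, `p99_latency_ms`) and limits are illustrative placeholders, and the snapshot is a plain dictionary rather than a query against Grafana, Datadog, CloudWatch, or Prometheus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    """Upper bound that a metric must stay under to count as healthy."""
    metric: str
    max_value: float


# Hypothetical definition of "healthy"; replace with values from your own dashboards.
STEADY_STATE: List[Threshold] = [
    Threshold("error_rate", 0.01),        # at most 1% of requests fail
    Threshold("p99_latency_ms", 500.0),   # p99 latency at or under 500 ms
]


def violations(metrics: Dict[str, float],
               thresholds: List[Threshold]) -> List[str]:
    """Return the names of metrics that are missing or above their limit."""
    failed = []
    for threshold in thresholds:
        value = metrics.get(threshold.metric)
        # A missing metric is treated as a violation: no data means no proof of health.
        if value is None or value > threshold.max_value:
            failed.append(threshold.metric)
    return failed


if __name__ == "__main__":
    snapshot = {"error_rate": 0.002, "p99_latency_ms": 310.0}
    print(violations(snapshot, STEADY_STATE))
```

An empty result means the snapshot satisfies the hypothesis. Running the same check before injecting a fault and again after recovery gives a simple pass/fail signal for the experiment.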

Instructions

Installs: 2
First Seen: Mar 20, 2026