Incident Response
Structured incident management from detection through postmortem, with resilience patterns for preventing and containing cascading failures.
When to Use
- Production incident in progress (outage, degradation, data loss)
- Designing circuit breakers, bulkheads, or fallback strategies
- Conducting or planning chaos engineering exercises
- Writing or reviewing postmortem documents
- Establishing on-call procedures and escalation paths
Avoid when:
- The issue is a development-time bug with no production impact
- Designing general system architecture (use system-design instead)
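As a rough illustration of the circuit breaker and fallback patterns named under "When to Use", the following is a minimal sketch rather than a prescribed implementation. The names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, `reset_timeout`, and `clock` are hypothetical and chosen for this example only. The sketch assumes single-threaded use and that any exception from the wrapped call counts as a failure.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical minimal breaker with closed, open, and half-open states."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before tripping
        self.reset_timeout = reset_timeout          # seconds to stay open before a trial call
        self._clock = clock                         # injectable so tests can control time
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func: Callable[[], T], fallback: Optional[Callable[[], T]] = None) -> T:
        # While open, skip the dependency entirely so a failing service
        # is not hammered and callers fail fast (or get the fallback).
        if self.state == "open":
            if fallback is not None:
                return fallback()
            raise CircuitOpenError("circuit open; call rejected")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # Trip when the threshold is reached, or re-open immediately
            # if this was a failed half-open trial call.
            if self._failures >= self.failure_threshold or self._opened_at is not None:
                self._opened_at = self._clock()
            if fallback is not None:
                return fallback()
            raise
        # Any success closes the breaker and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound dependency call, for example `breaker.call(fetch_live, fallback=read_cache)`, where `fetch_live` and `read_cache` are placeholder functions. A production version would also need locking for concurrent callers and a limit on simultaneous half-open trial calls.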