sre-incident-response
SRE Incident Response
Managing incidents and conducting effective postmortems.
Incident Severity Levels
P0 - Critical
- Impact: Service completely down or major functionality unavailable
- Response: Immediate, all-hands
- Communication: Status updates every 30 minutes
- Examples: Complete outage, data loss, security breach
P1 - High
- Impact: Significant degradation affecting many users
- Response: Immediate, primary on-call
- Communication: Status updates every hour
- Examples: Elevated error rates, slow response times
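The two severity levels above can be encoded as data so that tooling (a paging bot, a status-page reminder) reads the response and communication rules from one place. The following is a minimal sketch, not a definitive implementation: the names `Severity`, `SeverityPolicy`, `POLICIES`, and `next_update_due` are hypothetical and chosen for illustration only. The intervals and responder groups are taken from the lists above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Severity(Enum):
    """Severity levels described in this document (names are illustrative)."""
    P0 = "P0"
    P1 = "P1"


@dataclass(frozen=True)
class SeverityPolicy:
    """Hypothetical container for the per-severity rules listed above."""
    label: str
    responders: str
    update_interval: timedelta


# Values mirror the P0 and P1 bullet lists; the structure itself is an assumption.
POLICIES = {
    Severity.P0: SeverityPolicy("Critical", "all-hands", timedelta(minutes=30)),
    Severity.P1: SeverityPolicy("High", "primary on-call", timedelta(hours=1)),
}


def next_update_due(severity: Severity, last_update: datetime) -> datetime:
    """Return when the next status update is due for an incident."""
    return last_update + POLICIES[severity].update_interval


if __name__ == "__main__":
    started = datetime(2024, 1, 1, 12, 0)
    for sev in Severity:
        policy = POLICIES[sev]
        due = next_update_due(sev, started)
        print(f"{sev.value} ({policy.label}): page {policy.responders}, "
              f"next update due {due:%H:%M}")
```

Keeping the policy in a single table means a change to the communication cadence is made in one place rather than in every script that sends reminders.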