Alerting Strategies
Get paged for real problems, not noise.
Alerting Philosophy
"Every alert should be actionable, and every action should have a runbook."
The Goal
- Page for symptoms (user impact), not causes (internal metrics)
- Every page should require human judgment
- False positives erode trust; false negatives let outages go undetected
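As a minimal sketch of paging on a symptom rather than a cause, the check below looks at the user-facing error rate instead of an internal metric such as CPU. The function name, the 1% threshold, and the minimum-traffic guard are illustrative assumptions, not values from this skill.

```python
def should_page(
    total_requests: int,
    failed_requests: int,
    max_error_rate: float = 0.01,  # assumed threshold: 1% of requests failing
    min_requests: int = 100,       # assumed guard against low-traffic noise
) -> bool:
    """Return True when the user-facing error rate warrants a page."""
    # With very little traffic a single failure skews the ratio,
    # so skip paging to avoid false positives.
    if total_requests < min_requests:
        return False
    return failed_requests / total_requests > max_error_rate
```

The traffic guard trades some sensitivity for fewer false positives; a service that receives no traffic at all would need a separate check.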
Alert Severity Levels
| Level | Response | Time to Ack | Example |
|---|---|---|---|
| P1/Critical | Page immediately | 5 min | Service down, data loss |
| P2/High | Page during business hours | 30 min | Degraded performance |
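The severity table could be encoded as routing policy along these lines. This is a sketch under stated assumptions: the `SeverityPolicy` type, the helper names, and the 09:00 to 17:00 Monday-to-Friday definition of business hours are hypothetical and not part of this skill.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class SeverityPolicy:
    name: str
    ack_deadline: timedelta
    business_hours_only: bool


# Values taken from the severity table above.
POLICIES = {
    "P1": SeverityPolicy("P1/Critical", timedelta(minutes=5), False),
    "P2": SeverityPolicy("P2/High", timedelta(minutes=30), True),
}

# Assumption: business hours are 09:00-17:00, Monday to Friday, local time.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def in_business_hours(now: datetime) -> bool:
    return now.weekday() < 5 and BUSINESS_START <= now.time() < BUSINESS_END


def should_page_now(level: str, now: datetime) -> bool:
    """P1 pages at any time; P2 pages only during business hours."""
    policy = POLICIES[level]
    return not policy.business_hours_only or in_business_hours(now)


def ack_due(level: str, paged_at: datetime) -> datetime:
    """Time by which the page should be acknowledged."""
    return paged_at + POLICIES[level].ack_deadline
```

Keeping the policy in one table-like structure means a change to the acknowledgement window touches a single line.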