staff-engineering-skills-retry-storms
Retry Storms Trap
The service is slow. Your retries made it slower. Before adding retry logic, ask: what happens when every client retries at the same time against an already-struggling service?
The Feedback Loop
Retry storms are a positive feedback loop: worse performance draws more retries, which worsens performance, which draws more retries. The only stable states are "working fine" and "completely dead."
Service slow → clients time out → retry → 2x load → slower
→ more timeouts → more retries → 4x load → collapse
The system can't recover under load because retries prevent the load from decreasing.
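The loop can be illustrated with a toy model. This is a sketch under stated assumptions, not a measurement of any real system: the function name `simulate`, the load numbers, and the rule that useful throughput shrinks as overload grows are all hypothetical choices made for illustration. Every failed request is assumed to be retried exactly once on the next tick.

```python
def simulate(base_load, capacity, spike_load, ticks, retry):
    """Return the offered load per tick for a toy overload model.

    Assumptions (illustrative only):
    - tick 0 carries a one-off spike; every later tick carries base_load
    - when offered load exceeds capacity, useful throughput degrades
      to capacity * capacity / offered
    - with retry=True, every failed request comes back on the next tick
    """
    history = []
    failed = 0.0
    for tick in range(ticks):
        fresh = spike_load if tick == 0 else base_load
        offered = fresh + (failed if retry else 0.0)
        if offered <= capacity:
            served = offered
        else:
            # Overloaded: the service completes less work, not more.
            served = capacity * capacity / offered
        failed = offered - served
        history.append(offered)
    return history


if __name__ == "__main__":
    with_retries = simulate(80, 100, 150, 8, retry=True)
    without_retries = simulate(80, 100, 150, 8, retry=False)
    print("with retries:   ", [round(x) for x in with_retries])
    print("without retries:", [round(x) for x in without_retries])
```

In this model the steady demand (80) is below capacity (100), so without retries the service is healthy again one tick after the spike. With retries, the failures from the spike are added to the next tick's load, which keeps the service overloaded and produces even more failures.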
The Layer Multiplication Problem
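Retries multiply across layers. If three layers each make up to 3 attempts, one user request can become 3 × 3 × 3 = 27 backend calls in the worst case. If "3 retries" means 3 retries on top of the original attempt, each layer makes 4 attempts and the worst case is 4 × 4 × 4 = 64.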
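The multiplication can be checked with a short sketch. The function name `worst_case_backend_calls` is hypothetical, and the model assumes the worst case: every attempt at every layer fails, so each layer uses all of its attempts.

```python
from math import prod


def worst_case_backend_calls(attempts_per_layer):
    """Worst-case calls reaching the backend for one user request.

    attempts_per_layer lists, for each layer, the total number of
    attempts it makes (the original try plus its retries). Each
    attempt at one layer triggers a full round of attempts at the
    next layer down, so the totals multiply.
    """
    return prod(attempts_per_layer)


if __name__ == "__main__":
    # Three layers, 3 attempts each.
    print(worst_case_backend_calls([3, 3, 3]))
    # Three layers, 3 retries each on top of the original attempt.
    print(worst_case_backend_calls([4, 4, 4]))
```

Adding one more layer with the same policy multiplies the total again, which is why retry budgets are usually easier to reason about when only one layer retries.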