API Rate Limiting
Overview
Rate limiting caps the number of requests a client can make within a time window. Invoke this skill when protecting APIs from abuse, ensuring fair usage among clients, or preventing backend services from being overwhelmed.
Core Principles
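- Algorithm Selection: Choose an algorithm that fits the traffic pattern (token bucket tolerates short bursts, sliding window smooths limits across window boundaries, fixed window is the simplest but allows bursts at window edges)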
- Granularity: Rate limit by user, IP, API key, or endpoint
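- Graceful Degradation: Return HTTP 429 (Too Many Requests) with a Retry-After header so clients know when to retry
- Transparency: Include rate limit headers in every response (commonly X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset) so clients can track their remaining quota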
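As an illustration of the token bucket option named above, here is a minimal single-process sketch in Python. The class name `TokenBucket`, its parameters, and the injectable `clock` argument are assumptions made for this example, not part of any particular library. It is not thread-safe and keeps state in memory only.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,  # injectable so tests can control time
    ) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens and return True, or return False if too few remain."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def retry_after(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens are available (0.0 if they already are)."""
        self._refill()
        return max(0.0, (cost - self.tokens) / self.refill_rate)
```

With `capacity=2` and `refill_rate=1.0`, a client can send two requests back to back and is then held to roughly one request per second. The value from `retry_after()` can feed the Retry-After header on a 429 response. One bucket per user, IP, or API key gives the granularity described above.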
Preparation Checklist
- Identify rate limit dimensions (per user, per IP, per endpoint)
- Choose storage backend (Redis for distributed deployments, in-memory for a single instance)
- Define rate limit parameters (requests per minute/hour)
- Plan response format for throttled requests
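The checklist items can be tied together in a small sketch, assuming an in-memory backend and a fixed-window counter. The limit values, the names `FixedWindowLimiter`, `Decision`, and `throttled_response`, and the JSON body shape are all hypothetical. Treating `X-RateLimit-Reset` as seconds until the window resets is also an assumption, since some APIs send an epoch timestamp instead.

```python
import math
import time
from typing import Callable, Dict, NamedTuple, Tuple

# Hypothetical limits for illustration: dimension -> (max requests, window in seconds)
DEFAULT_LIMITS: Dict[str, Tuple[int, int]] = {
    "user": (100, 60),    # 100 requests per minute per user
    "ip": (1000, 3600),   # 1000 requests per hour per IP
}


class Decision(NamedTuple):
    allowed: bool
    limit: int
    remaining: int
    reset_in: int  # seconds until the current window ends


class FixedWindowLimiter:
    """In-memory fixed-window counter; only suitable for a single instance."""

    def __init__(self, limits=DEFAULT_LIMITS, clock: Callable[[], float] = time.time):
        self.limits = dict(limits)
        self.clock = clock
        # (dimension, identifier, window index) -> request count.
        # Old windows are never evicted in this sketch.
        self.counters: Dict[Tuple[str, str, int], int] = {}

    def check(self, dimension: str, identifier: str) -> Decision:
        limit, window = self.limits[dimension]
        now = self.clock()
        window_index = int(now // window)
        key = (dimension, identifier, window_index)
        count = self.counters.get(key, 0)
        reset_in = math.ceil((window_index + 1) * window - now)
        if count < limit:
            self.counters[key] = count + 1
            return Decision(True, limit, limit - count - 1, reset_in)
        return Decision(False, limit, 0, reset_in)


def rate_limit_headers(d: Decision) -> Dict[str, str]:
    """Headers to attach to every response, throttled or not."""
    return {
        "X-RateLimit-Limit": str(d.limit),
        "X-RateLimit-Remaining": str(d.remaining),
        "X-RateLimit-Reset": str(d.reset_in),
    }


def throttled_response(d: Decision):
    """Return (status, headers, body) for a rejected request."""
    headers = rate_limit_headers(d)
    headers["Retry-After"] = str(d.reset_in)
    body = {"error": "rate_limit_exceeded", "retry_after_seconds": d.reset_in}
    return 429, headers, body
```

In a distributed deployment the `counters` dictionary would be replaced by a shared store such as Redis, so that all instances see the same counts. The rest of the flow (check, attach headers, return 429 when rejected) would stay the same.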