Exa Load & Scale
Overview
Load testing and capacity planning for Exa integrations. Key constraint: Exa's default rate limit is 10 QPS. Scaling strategies focus on caching, request queuing, parallel processing within rate limits, and search type selection for latency budgets.
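As a minimal sketch of request queuing within the rate limit, the pacing limiter below spaces calls at least 1/QPS seconds apart. The class name `QpsLimiter` and its `acquire` method are hypothetical, introduced only for illustration; they are not part of any Exa SDK. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable


class QpsLimiter:
    """Hypothetical pacing limiter: spaces calls at least 1/qps seconds apart."""

    def __init__(
        self,
        qps: float = 10.0,  # default limit stated in this document
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()  # earliest time the next call may start

    def acquire(self) -> float:
        """Block until the next slot is free and return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_slot - now)
        if wait > 0:
            self._sleep(wait)
        # Reserve the following slot one interval after this call's start time.
        self._next_slot = max(now, self._next_slot) + self.interval
        return wait
```

Calling `limiter.acquire()` immediately before each Exa request keeps a single process at or below the configured rate. This sketch is not thread-safe and does not coordinate across processes; a shared limiter (for example, one backed by Redis) would be needed for that.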
Prerequisites
- k6 load testing tool installed
- Test environment Exa API key (separate from production)
- Redis for result caching
Capacity Reference
| Search Type | Typical Latency | Max Throughput (10 QPS) |
|---|---|---|
| instant | < 150ms | 10 req/s (600/min) |
| fast | < 425ms | 10 req/s (600/min) |
| auto | 300-1500ms | 10 req/s (600/min) |
| neural | 500-2000ms | 10 req/s (600/min) |
| deep | 2-5s | 10 req/s (600/min) |
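
The table above can be turned into rough planning numbers. The sketch below applies Little's law (requests in flight = request rate × latency) to estimate how many concurrent requests are needed to saturate the 10 QPS limit at each search type's upper-bound latency, and shows how a cache hit rate raises the user-facing rate the same limit can serve. The function names are hypothetical, and the cache arithmetic assumes that cache hits never reach Exa.

```python
import math

RATE_LIMIT_QPS = 10  # default limit stated in this document

# Upper-bound latencies in seconds, taken from the capacity table.
LATENCY_UPPER_S = {
    "instant": 0.150,
    "fast": 0.425,
    "auto": 1.5,
    "neural": 2.0,
    "deep": 5.0,
}


def required_concurrency(search_type: str, qps: float = RATE_LIMIT_QPS) -> int:
    """Little's law: in-flight requests needed to sustain `qps` at worst-case latency."""
    return math.ceil(qps * LATENCY_UPPER_S[search_type])


def effective_qps(cache_hit_rate: float, qps: float = RATE_LIMIT_QPS) -> float:
    """User-facing request rate sustainable when only cache misses reach Exa."""
    if not 0 <= cache_hit_rate < 1:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    return qps / (1 - cache_hit_rate)


print(required_concurrency("instant"))  # 2
print(required_concurrency("deep"))     # 50
print(effective_qps(0.75))              # 40.0
```

Under these assumptions, a load test needs about 50 virtual users to reach the limit with `deep` searches but only 2 with `instant`, and a 75% cache hit rate lets the same 10 QPS budget serve roughly 40 user-facing requests per second.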