Prompt Caching

What Gets Cached

Prompt caching = KV caching. The provider stores K (key) and V (value) matrices from the attention mechanism between requests. When a new request shares a prefix with a cached prompt, the provider reuses stored matrices instead of recomputing them:

  • Up to 90% cost reduction on cached input tokens
  • Up to 85% latency reduction (time-to-first-token) for long prompts
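A quick worked example of what the cost figure above means in practice. This is a hedged sketch, not any provider's billing formula: it simply assumes cached input tokens are billed at 10% of the normal input rate (the "up to 90% reduction" case) and that the rest of the prompt is billed normally.

```python
def effective_input_cost(total_tokens: int, cached_tokens: int,
                         rate_per_token: float,
                         cache_discount: float = 0.9) -> float:
    """Input cost of one request when `cached_tokens` of its prefix hit the cache.

    Assumes cached tokens cost (1 - cache_discount) of the normal rate;
    the discount value is an assumption for illustration.
    """
    uncached = total_tokens - cached_tokens
    cached_cost = cached_tokens * rate_per_token * (1 - cache_discount)
    return cached_cost + uncached * rate_per_token

# 10,000-token prompt at a hypothetical $3 per 1M input tokens:
full_price = effective_input_cost(10_000, 0, 3e-6)       # no cache hit: $0.03
with_cache = effective_input_cost(10_000, 8_000, 3e-6)   # 8,000 tokens cached: $0.0084
# Overall saving on this request: 1 - 0.0084/0.03 = 72%, even though
# the per-cached-token discount is 90% -- only the cached share saves.
```

Note that the blended saving depends on what fraction of the prompt actually hits the cache, which is why providers quote "up to" figures.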

Cache matching is prefix-based -- a request that shares only part of a cached prefix still reuses that part, but any change to an early token invalidates everything after it. Temperature, top_p, and top_k do not affect caching (they are applied after attention, so the cached K/V matrices are unchanged).
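The prefix rule can be sketched with a toy cache. This is an assumption-laden illustration, not any provider's real implementation: it models the cache as a set of stored token prefixes and reuses the longest one a new request shares, which is why stable content (system prompt, tools, documents) should come first.

```python
def longest_cached_prefix(cache: set, tokens: list) -> int:
    """Return how many leading tokens of `tokens` match a stored prefix."""
    for n in range(len(tokens), 0, -1):
        if tuple(tokens[:n]) in cache:
            return n
    return 0

# Toy "tokens": each string stands in for a block of real tokens.
cache = {("sys", "tools"), ("sys", "tools", "doc")}

# Same stable prefix, new question: 3 of 4 blocks reused, only the tail recomputed.
reused = longest_cached_prefix(cache, ["sys", "tools", "doc", "question-1"])

# Editing the system prompt changes the first token, so nothing matches.
invalidated = longest_cached_prefix(cache, ["sys-v2", "tools", "doc", "question-1"])
```

Here `reused` is 3 and `invalidated` is 0: partial matches work, but only from the left.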

Provider Comparison

Repository: maxmurr/skills
First Seen: Apr 27, 2026