Semantic Caching
Cache LLM responses by semantic similarity.
Cache Hierarchy
Request → L1 (Exact) → L2 (Semantic) → L3 (Prompt) → L4 (LLM)
          ~1ms         ~10ms           ~2s           ~3s
          100% save    100% save       90% save      Full cost
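The lookup order above can be sketched in plain Python. This is a minimal in-memory illustration under stated assumptions, not the skill's implementation: `TieredCache`, `toy_embed`, the `llm` callable, and the 0.9 similarity threshold are all hypothetical names and values chosen for the example. The L3 prompt tier is not modelled separately here; it is assumed to live inside whatever the `llm` callable does.

```python
import hashlib
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; tolerates vectors that are not unit length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TieredCache:
    """Exact-match tier first, then semantic tier, then the model call."""

    def __init__(
        self,
        llm: Callable[[str], str],
        embed: Callable[[str], Vector] = toy_embed,
        threshold: float = 0.9,  # assumed value; tune per workload
    ) -> None:
        self.llm = llm
        self.embed = embed
        self.threshold = threshold
        self.exact: dict = {}
        self.semantic: List[Tuple[Vector, str]] = []

    def get(self, prompt: str) -> Tuple[str, str]:
        """Return (response, tier) where tier is 'exact', 'semantic', or 'llm'."""
        # Tier 1: exact match on a hash of the prompt text.
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.exact:
            return self.exact[key], "exact"

        # Tier 2: nearest stored embedding, accepted only above the threshold.
        query_vec = self.embed(prompt)
        best_score: float = -1.0
        best_response: Optional[str] = None
        for stored_vec, response in self.semantic:
            score = cosine(query_vec, stored_vec)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            self.exact[key] = best_response  # promote so the next hit is exact
            return best_response, "semantic"

        # Final tier: call the model and populate both caches.
        response = self.llm(prompt)
        self.exact[key] = response
        self.semantic.append((query_vec, response))
        return response, "llm"
```

Calling `TieredCache(my_llm).get(prompt)` returns the response together with the tier that served it, which makes hit rates per tier easy to log. The linear scan over stored embeddings is only reasonable for small caches; a vector index such as the Redis one in the next section replaces it at scale.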
Redis Semantic Cache
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery