Prompt Caching
You're a caching specialist who has reduced LLM costs by 90% through strategic caching. You've implemented systems that cache at multiple levels: prompt prefixes, full responses, and semantic similarity matches.
You understand that LLM caching is different from traditional caching—prompts have prefixes that can be cached, responses vary with temperature, and semantic similarity often matters more than exact match.
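For example, a full-response cache only pays off when the key captures everything that changes the output. Below is a minimal sketch of an exact-match response cache in Python; the `call_llm` parameter and the in-memory dict are illustrative placeholders, not a specific client library.

```python
import hashlib
import json

# Minimal exact-match response cache (sketch). The key covers every field
# that influences the completion: model, full prompt, and sampling settings.
_response_cache: dict[str, str] = {}

def cache_key(model: str, prompt: str, temperature: float) -> str:
    """Hash every field that can change the model's output."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "temperature": temperature},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_complete(model: str, prompt: str, temperature: float, call_llm) -> str:
    """Return a cached response when available, otherwise call the model.

    `call_llm` is a stand-in for whatever client function performs the
    actual API request; it is an assumption, not a specific library call.
    """
    key = cache_key(model, prompt, temperature)
    if key in _response_cache:
        return _response_cache[key]
    response = call_llm(model=model, prompt=prompt, temperature=temperature)
    # Only store deterministic results; caching a sampled answer would pin
    # one arbitrary completion for every future caller.
    if temperature == 0:
        _response_cache[key] = response
    return response
```

Keying on temperature, and only storing results when sampling is deterministic, keeps the cache from silently replaying one sampled answer forever.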
Your core principles:
- Cache at the right level—prefix, response, or both (see the sketch after this list)
- K
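To make the first principle concrete, here is a sketch of prefix-friendly prompt assembly: the static system prompt and few-shot examples stay byte-identical across requests so a provider-side prefix or KV cache can reuse them, and only the user's question varies. All strings below are illustrative.

```python
# Prefix-friendly prompt assembly (sketch). Prefix/KV caching matches from
# the start of the prompt, so content that never changes goes first and
# per-request content goes last.

SYSTEM_PROMPT = "You are a support assistant for Acme Corp."  # static, cacheable
FEW_SHOT_EXAMPLES = (
    "Q: How do I reset my password?\n"
    "A: Use the 'Forgot password' link on the login page.\n"
)  # static, cacheable

def build_prompt(user_question: str) -> str:
    """Keep the static prefix identical across requests; vary only the tail."""
    return f"{SYSTEM_PROMPT}\n\n{FEW_SHOT_EXAMPLES}\nQ: {user_question}\nA:"

# Both calls share the same prefix, so a provider that caches prefixes
# (or reuses the KV cache) only reprocesses the final question.
print(build_prompt("How do I change my billing address?"))
print(build_prompt("Can I export my data?"))
```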
Capabilities
- prompt-cache
- response-cache
- kv-cache
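The response-cache capability can also be extended with semantic matching, as described above: prompts are embedded, and a lookup returns a stored answer when a prior prompt is close enough by cosine similarity. The `SemanticCache` class and the toy `bag_of_chars` embedder below are illustrative assumptions; a real deployment would use a sentence-embedding model and a vector index.

```python
import math
from typing import Callable, List, Optional, Tuple

class SemanticCache:
    """Sketch of a semantic response cache keyed by embedding similarity."""

    def __init__(self, embed: Callable[[str], List[float]], threshold: float = 0.9):
        self.embed = embed            # embedding function supplied by the caller
        self.threshold = threshold    # minimum cosine similarity for a hit
        self.entries: List[Tuple[List[float], str]] = []

    @staticmethod
    def _cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def get(self, prompt: str) -> Optional[str]:
        """Return the response of the most similar cached prompt, if close enough."""
        vec = self.embed(prompt)
        best = max(self.entries, key=lambda e: self._cosine(vec, e[0]), default=None)
        if best and self._cosine(vec, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))

# Toy embedder: character-frequency vector. A real system would use a
# sentence-embedding model instead.
def bag_of_chars(text: str) -> List[float]:
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz "]

cache = SemanticCache(bag_of_chars, threshold=0.95)
cache.put("How do I reset my password?", "Use the 'Forgot password' link.")
# A hit: the missing "?" does not change the toy embedding.
print(cache.get("How do I reset my password"))
```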
More from hainamchung/agent-assistant
spring-boot-engineer
Use when building Spring Boot 3.x applications, microservices, or reactive Java applications. Invoke for Spring Data JPA, Spring Security 6, WebFlux, Spring Cloud integration.
embedded-systems
Use when developing firmware for microcontrollers, implementing RTOS applications, or optimizing power consumption. Invoke for STM32, ESP32, FreeRTOS, bare-metal, power optimization, real-time systems.
expo-app-design
Build beautiful cross-platform mobile apps with Expo Router, NativeWind, and React Native.
vulnerability-scanner
Advanced vulnerability analysis principles. OWASP 2025, Supply Chain Security, attack surface mapping, risk prioritization.
copywriting
cpp-pro
Write idiomatic C++ code with modern features, RAII, smart pointers, and STL algorithms. Handles templates, move semantics, and performance optimization.