langchain-cost-tuning
LangChain Cost Tuning
Overview
Reduce LLM API costs without sacrificing output quality, using five techniques: token-tracking callbacks, model tiering (routing simple tasks to cheaper models), caching of duplicate queries, prompt compression, and budget enforcement.
Current Pricing Reference (2026)
| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| OpenAI | gpt-4o | $2.50 | $10.00 |
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| Anthropic | claude-sonnet | $3.00 | $15.00 |
| Anthropic | claude-haiku | $0.25 | $1.25 |
| OpenAI | text-embedding-3-small | $0.02 | - |
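To make the table concrete, here is a minimal sketch that turns the per-million-token prices into a per-call estimate. The `PRICING` map and the `estimateCost` helper are hypothetical names introduced for illustration; the numbers are copied from the chat-model rows above and should be checked against each provider's current price list before use.

```typescript
// Prices in USD per 1M tokens, copied from the chat-model rows of the table above.
type Price = { input: number; output: number };

const PRICING: Record<string, Price> = {
  "gpt-4o": { input: 2.5, output: 10.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "claude-sonnet": { input: 3.0, output: 15.0 },
  "claude-haiku": { input: 0.25, output: 1.25 },
};

// Hypothetical helper: estimated USD cost of a single call.
function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICING[model];
  if (!price) {
    throw new Error(`No pricing entry for model: ${model}`);
  }
  // Prices are quoted per 1M tokens, so scale the weighted sum down.
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}
```

For a call with 1,000 input tokens and 500 output tokens, this works out to about $0.0075 on gpt-4o versus about $0.00045 on gpt-4o-mini, which is the gap that model tiering exploits.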
Strategy 1: Token Usage Tracking
import { BaseCallbackHandler } from "@langchain/core/callbacks/base";
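The import above brings in the base class for custom callbacks. What follows is a minimal, dependency-free sketch of the accumulation logic such a handler might contain. `TokenUsageTracker`, `LLMResultLike`, and `TokenUsage` are hypothetical names, and the `llmOutput.tokenUsage` shape (`promptTokens`, `completionTokens`) is an assumption modeled on what OpenAI chat models typically report; other providers may expose usage under different fields. In a real project the class would extend `BaseCallbackHandler` and be passed via the `callbacks` option.

```typescript
// Assumed shape of the usage payload; real providers may report different fields.
interface TokenUsage {
  promptTokens?: number;
  completionTokens?: number;
}

interface LLMResultLike {
  llmOutput?: { tokenUsage?: TokenUsage };
}

// Hypothetical tracker, kept standalone so the sketch runs without LangChain.
// In a LangChain project it would extend BaseCallbackHandler instead.
class TokenUsageTracker {
  name = "token_usage_tracker";
  promptTokens = 0;
  completionTokens = 0;
  calls = 0;

  // Named after the callback hook that fires when an LLM call finishes.
  handleLLMEnd(output: LLMResultLike): void {
    const usage = output.llmOutput?.tokenUsage;
    this.calls += 1;
    // Missing usage data counts as zero rather than failing the call.
    this.promptTokens += usage?.promptTokens ?? 0;
    this.completionTokens += usage?.completionTokens ?? 0;
  }

  get totalTokens(): number {
    return this.promptTokens + this.completionTokens;
  }
}
```

Keeping prompt and completion counts separate matters because output tokens are priced several times higher than input tokens in the table above, so a single combined total would hide most of the cost signal.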