groq-cost-tuning


Groq Cost Tuning

Overview

Optimize Groq inference costs by routing each use case to the cheapest model that meets its quality bar and by managing token volume. Groq's per-token pricing is low (roughly $0.05 per million tokens for Llama 3.1 8B, $0.24 for Mixtral, and $0.59 for Llama 3.3 70B), but its high throughput (500+ tokens/sec) makes it easy to consume large token volumes, and therefore budget, very quickly.
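To make the throughput point concrete, here is a rough back-of-the-envelope sketch. It assumes a single blended per-token rate per model (the approximate figures quoted above, with no input/output split) and one stream generating continuously at 500 tokens/sec; the names `PRICE_PER_1M_TOKENS`, `estimateCostUsd`, and `tokensPerDay` are hypothetical helpers, not part of any Groq SDK.

```typescript
// Approximate prices from the overview, in USD per 1M tokens.
// Assumption: one blended rate per model (no input/output split).
const PRICE_PER_1M_TOKENS = {
  'llama-3.1-8b-instant': 0.05,
  'llama-3.3-70b-versatile': 0.59,
} as const;

type PricedModel = keyof typeof PRICE_PER_1M_TOKENS;

// Cost in USD for a given number of tokens on a given model.
function estimateCostUsd(model: PricedModel, tokens: number): number {
  return (tokens / 1e6) * PRICE_PER_1M_TOKENS[model];
}

// Tokens produced by one stream running continuously for 24 hours.
function tokensPerDay(tokensPerSecond: number): number {
  return tokensPerSecond * 60 * 60 * 24;
}

const dailyTokens = tokensPerDay(500); // 43,200,000 tokens
console.log(estimateCostUsd('llama-3.3-70b-versatile', dailyTokens).toFixed(2)); // "25.49"
console.log(estimateCostUsd('llama-3.1-8b-instant', dailyTokens).toFixed(2));    // "2.16"
```

Under these assumptions a single always-on stream costs roughly $25 per day on the 70B model versus about $2 on the 8B model, which is why model choice dominates the bill once traffic scales.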

Prerequisites

  • Groq Cloud account with billing dashboard access
  • Understanding of which use cases need which model quality
  • Application-level request routing capability

Instructions

Step 1: Implement Smart Model Routing

// Route each request to the cheapest model that meets its quality requirements.
// costPer1MTokens is the approximate price in USD per million tokens.
const MODEL_ROUTING: Record<string, { model: string; costPer1MTokens: number }> = {
  'classification':   { model: 'llama-3.1-8b-instant',    costPer1MTokens: 0.05 },
  'summarization':    { model: 'llama-3.1-8b-instant',    costPer1MTokens: 0.05 },
  'code-review':      { model: 'llama-3.3-70b-versatile', costPer1MTokens: 0.59 },
  'creative-writing': { model: 'llama-3.3-70b-versatile', costPer1MTokens: 0.59 },
};
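A routing table like the one above still needs a lookup step at request time. The following is a minimal, self-contained sketch of how that might look; it is not the skill author's implementation. `ROUTES`, `DEFAULT_ROUTE`, `routeFor`, and `estimateRequestCostUsd` are hypothetical names, the table is held in a `Map` so that unknown keys return `undefined` cleanly, and the fallback to the larger model for unrecognized use cases is an assumption (it favors quality over cost when the requirement is unknown).

```typescript
interface Route {
  model: string;
  costPer1MTokens: number; // approximate USD per 1M tokens
}

// Hypothetical routing table mirroring the four entries shown above.
const ROUTES = new Map<string, Route>([
  ['classification',   { model: 'llama-3.1-8b-instant',    costPer1MTokens: 0.05 }],
  ['summarization',    { model: 'llama-3.1-8b-instant',    costPer1MTokens: 0.05 }],
  ['code-review',      { model: 'llama-3.3-70b-versatile', costPer1MTokens: 0.59 }],
  ['creative-writing', { model: 'llama-3.3-70b-versatile', costPer1MTokens: 0.59 }],
]);

// Assumption: unknown use cases fall back to the larger model, since their
// quality requirement has not been established.
const DEFAULT_ROUTE: Route = { model: 'llama-3.3-70b-versatile', costPer1MTokens: 0.59 };

// Look up the route for a use case, falling back to the default.
function routeFor(useCase: string): Route {
  return ROUTES.get(useCase) ?? DEFAULT_ROUTE;
}

// Estimated cost in USD of a request with the given total token count.
function estimateRequestCostUsd(useCase: string, totalTokens: number): number {
  return (totalTokens / 1e6) * routeFor(useCase).costPer1MTokens;
}

console.log(routeFor('classification').model); // "llama-3.1-8b-instant"
console.log(routeFor('unknown-task').model);   // "llama-3.3-70b-versatile"
```

The value of `routeFor(useCase).model` would then be passed as the model name in whatever client call the application makes. Flipping `DEFAULT_ROUTE` to the 8B model is the cheaper choice if unclassified traffic is known to be low-stakes.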
First Seen
Jan 25, 2026