groq-performance-tuning
Groq Performance Tuning
Overview
Get the most out of Groq's low-latency LPU inference. Groq delivers sub-100ms token generation, so tuning focuses on four areas: streaming efficiency, prompt caching, model selection (speed versus quality), and parallel request orchestration.
Prerequisites
- Groq API key with rate limit awareness
- groq-sdk npm package installed
- Understanding of LLM token economics
- Monitoring for TTFT (time to first token)
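For the TTFT monitoring prerequisite, a small timing helper is often enough to start with. The sketch below is one possible approach, not part of groq-sdk: `measureStream`, `StreamChunk`, and `StreamTiming` are hypothetical names, and the chunk shape is assumed to follow the OpenAI-style `choices[0].delta.content` layout that the SDK yields when streaming.

```typescript
// Minimal shape of a streamed chat chunk (assumption: OpenAI-style deltas,
// as yielded by groq-sdk when `stream: true` is set).
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Hypothetical result type for this sketch.
interface StreamTiming {
  ttftMs: number | null; // time to the first non-empty content delta
  totalMs: number;       // time until the stream ended
  chunks: number;        // number of content-bearing chunks
  text: string;          // concatenated output
}

// Consume a stream and record when the first content arrives.
// The clock is injectable so the helper can be tested deterministically.
async function measureStream(
  stream: AsyncIterable<StreamChunk>,
  now: () => number = () => Date.now(),
): Promise<StreamTiming> {
  const start = now();
  let ttftMs: number | null = null;
  let chunks = 0;
  let text = '';

  for await (const chunk of stream) {
    const piece = chunk.choices[0]?.delta?.content ?? '';
    if (piece.length === 0) continue; // skip role-only or empty deltas
    if (ttftMs === null) ttftMs = now() - start;
    chunks += 1;
    text += piece;
  }

  return { ttftMs, totalMs: now() - start, chunks, text };
}
```

In practice the stream would come from `groq.chat.completions.create({ ..., stream: true })`. Note that the timer here starts when iteration begins, so to include request latency you would start your own timer before issuing the call.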
Instructions
Step 1: Select Optimal Model for Speed
import Groq from 'groq-sdk';
const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });
// Model speed tiers (approximate TTFT):
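TTFT varies with model, prompt length, and load, so one way to apply this step is to rank candidate models by your own measurements. The sketch below assumes you have already collected TTFT samples per model. `ModelSample`, `pickFastestModel`, and the `qualityTier` rating are hypothetical names introduced for illustration, and no figures here are published Groq numbers.

```typescript
// One measured candidate. Model IDs and timings are supplied by the caller.
interface ModelSample {
  model: string;       // model ID as passed to the API
  ttftMs: number[];    // TTFT samples from your own measurements
  qualityTier: number; // hypothetical rating you assign: 1 (basic) to 3 (best)
}

// The median is less sensitive to one slow request than the mean.
function median(values: number[]): number {
  if (values.length === 0) throw new Error('median of empty sample');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Return the model with the lowest median TTFT among those that meet the
// minimum quality tier, or null when none qualifies.
function pickFastestModel(
  samples: ModelSample[],
  minTier: number,
): string | null {
  let best: { model: string; ttft: number } | null = null;
  for (const s of samples) {
    if (s.qualityTier < minTier || s.ttftMs.length === 0) continue;
    const ttft = median(s.ttftMs);
    if (best === null || ttft < best.ttft) best = { model: s.model, ttft };
  }
  return best === null ? null : best.model;
}
```

Filtering on quality first and then minimizing latency keeps the speed-versus-quality trade-off explicit: raising `minTier` can only keep or increase the latency of the chosen model.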