groq-performance-tuning

Groq Performance Tuning

Overview

Get the most out of Groq's low-latency LPU inference. Groq delivers sub-100ms token generation, so tuning focuses on four areas: streaming efficiency, prompt caching, model selection (speed versus quality), and parallel request orchestration.
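As an illustration of parallel request orchestration, the sketch below runs a batch of async jobs with a fixed cap on how many are in flight at once. It is a generic helper, not part of groq-sdk: the name mapWithConcurrency and its parameters are hypothetical, and the right cap depends on the rate limits of your account.

```typescript
// Hypothetical helper (not a groq-sdk API): run `worker` over `items`
// with at most `limit` calls in flight, preserving input order in the output.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each lane claims the next unclaimed index, awaits the worker, and repeats.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i] as T, i);
    }
  }

  // Start no more lanes than there are items.
  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With a Groq client, the worker would typically wrap one chat completion call per prompt. If any worker rejects, the returned promise rejects as well, so retry or backoff logic for rate-limit errors belongs inside the worker.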

Prerequisites

  • Groq API key, plus awareness of the rate limits that apply to it
  • groq-sdk npm package installed
  • Understanding of LLM token economics
  • Monitoring for TTFT (time to first token)
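For the TTFT monitoring prerequisite, a minimal sketch of how time to first token could be measured on a streamed response follows. It assumes the stream is an async iterable of OpenAI-style chunks, where text arrives in choices[0].delta.content. The names StreamChunk, StreamTiming, and measureTtft are hypothetical, and Date.now() limits the resolution to about a millisecond.

```typescript
// Assumed chunk shape: OpenAI-style streamed chat completion chunks.
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

interface StreamTiming {
  text: string;          // full concatenated output
  ttftMs: number | null; // time to first non-empty content; null if none arrived
  totalMs: number;       // time until the stream ended
}

// Hypothetical helper: consume a chunk stream and record when the first
// non-empty piece of content arrives, measured from the start of iteration.
async function measureTtft(stream: AsyncIterable<StreamChunk>): Promise<StreamTiming> {
  const start = Date.now();
  let ttftMs: number | null = null;
  let text = '';
  for await (const chunk of stream) {
    const piece = chunk.choices[0]?.delta?.content ?? '';
    if (piece.length > 0 && ttftMs === null) {
      ttftMs = Date.now() - start;
    }
    text += piece;
  }
  return { text, ttftMs, totalMs: Date.now() - start };
}
```

Because the timer starts when iteration begins, the time spent creating the request is not included. To capture end-to-end TTFT, record the start time before issuing the streaming request and adjust the helper accordingly.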

Instructions

Step 1: Select Optimal Model for Speed

import Groq from 'groq-sdk';

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

// Model speed tiers (approximate TTFT):
First Seen
Jan 25, 2026