groq-performance-tuning


Groq Performance Tuning

Overview

Get the most out of Groq's ultra-low-latency LPU inference. Groq delivers sub-100ms token generation, so tuning focuses on four areas: streaming efficiency, prompt caching, choosing a model that balances speed against quality, and orchestrating parallel requests.
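Parallel request orchestration usually means keeping several requests in flight without exceeding the account's rate limits. The following is a minimal sketch of a bounded-concurrency helper. `mapWithConcurrency` is a hypothetical name, not part of groq-sdk, and `worker` stands in for whatever function issues a single Groq request.

```typescript
// Hypothetical helper (not part of groq-sdk): run `worker` over `items`
// with at most `limit` calls in flight at any moment.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each runner repeatedly claims the next unclaimed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as `items`
}
```

Results come back in input order regardless of which request finishes first. If any `worker` call rejects, the returned promise rejects as well; retry and backoff policy is deliberately left out of this sketch.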

Prerequisites

  • A Groq API key, plus awareness of the rate limits that apply to it
  • groq-sdk npm package installed
  • Understanding of LLM token economics
  • Monitoring for TTFT (time to first token)
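One way to monitor TTFT is to wrap the streamed response and record when the first non-empty piece of text arrives. This is a sketch under assumptions: `measureStream` and `StreamTiming` are hypothetical names, and the stream is modeled as a plain `AsyncIterable<string>` of text deltas rather than the SDK's own chunk objects, so a real caller would map each chunk to its text first.

```typescript
// Hypothetical timing record; field names are illustrative only.
interface StreamTiming {
  ttftMs: number | null; // null if the stream produced no text at all
  totalMs: number;
  chunks: number;
  text: string;
}

// `open` starts the request and returns the stream of text deltas.
// The clock starts before `open` is called, so request latency counts toward TTFT.
async function measureStream(
  open: () => AsyncIterable<string> | Promise<AsyncIterable<string>>,
  now: () => number = () => Date.now(),
): Promise<StreamTiming> {
  const start = now();
  let ttftMs: number | null = null;
  let chunks = 0;
  let text = '';

  for await (const piece of await open()) {
    if (piece.length === 0) continue; // skip empty deltas
    if (ttftMs === null) ttftMs = now() - start; // first real token
    chunks++;
    text += piece;
  }

  return { ttftMs, totalMs: now() - start, chunks, text };
}
```

The `now` parameter exists so a deterministic clock can be injected in tests. `Date.now()` has millisecond resolution, which is coarse for sub-100ms measurements, so a higher-resolution clock could be passed in where one is available.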

Instructions

Step 1: Select Optimal Model for Speed

import Groq from 'groq-sdk';

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });
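The rest of this step is not shown in the source. As a rough illustration of trading speed against quality, the choice can be reduced to a small lookup. Everything in this sketch is an assumption: `pickModel`, `Priority`, and `DEFAULT_MODELS` are hypothetical names, and the two model IDs are examples that should be checked against Groq's current model list before use.

```typescript
// Hypothetical priority levels for a request.
type Priority = 'speed' | 'quality';

// Example model IDs only (assumption): verify against Groq's current model list.
const DEFAULT_MODELS: Record<Priority, string> = {
  speed: 'llama-3.1-8b-instant', // smaller model, typically lower latency
  quality: 'llama-3.3-70b-versatile', // larger model, typically better output
};

// Return the model ID for a priority, letting callers override either entry.
function pickModel(
  priority: Priority,
  overrides: Partial<Record<Priority, string>> = {},
): string {
  return overrides[priority] ?? DEFAULT_MODELS[priority];
}
```

The returned ID would then be passed as the `model` field of a `groq.chat.completions.create` call. Keeping the mapping in one place makes it easy to swap models when the available list changes.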
First Seen
Jan 25, 2026
groq-performance-tuning — jeremylongshore/claude-code-plugins-plus-skills