groq
Installation
SKILL.md
Groq
Groq provides ultra-fast LLM inference using its custom LPU (Language Processing Unit) hardware. The API is fully OpenAI-compatible, so any workflow that works against api.openai.com can be pointed at api.groq.com/openai/v1 with minimal changes.
Official docs:
https://console.groq.com/docs/overview
When to Use
Use this skill when you need to:
- Run chat completions at extremely low latency (Groq LPU is significantly faster than GPU-based inference)
- Use open-weight models such as Llama 3.3 70B, Llama 3.1 8B, Mixtral 8x7B, or Gemma 2 9B
- Transcribe audio using Whisper via an OpenAI-compatible endpoint
- List available models on Groq's platform
- Drop in a fast, cost-effective inference backend where OpenAI compatibility is assumed