groq-reference-architecture
Groq Reference Architecture
Overview
Production architecture for ultra-fast LLM inference with Groq LPU. Covers model routing by latency requirements, streaming pipelines, fallback strategies, and integration patterns for real-time AI applications.
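As a rough illustration of routing by latency requirement with a fallback strategy, the sketch below picks an ordered model chain from a latency budget and tries each model in turn. The tier names, the 500 ms threshold, the model IDs, and the helper names `routeByLatency` and `completeWithFallback` are all assumptions made for this example, not part of the skill itself.

```typescript
// Hypothetical latency tiers; thresholds and model IDs are illustrative assumptions.
type Tier = "realtime" | "standard";

interface Route {
  tier: Tier;
  models: string[]; // primary model first, then fallbacks in order
}

const MODEL_CHAINS: Record<Tier, string[]> = {
  realtime: ["llama-3.1-8b-instant"],
  standard: ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"],
};

// Pick a model chain from the caller's latency budget (assumed 500 ms cutoff).
function routeByLatency(maxLatencyMs: number): Route {
  const tier: Tier = maxLatencyMs <= 500 ? "realtime" : "standard";
  return { tier, models: MODEL_CHAINS[tier] };
}

// Try each model in order; return the first success, rethrow the last failure.
async function completeWithFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next model
    }
  }
  throw lastError;
}
```

In a real application, `call` would wrap the actual inference request (for example a chat completion made through groq-sdk), so the routing and fallback logic stays independent of the client.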
Prerequisites
- Groq API key
- groq-sdk npm package
- Understanding of model capabilities (Llama, Mixtral)
- Monitoring for latency and token usage
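The monitoring prerequisite could be met with something as small as the in-memory recorder sketched here. `UsageMonitor` and `CallSample` are hypothetical names, and the sketch assumes each response reports a total token count that the caller passes in; a production setup would more likely export these values to an existing metrics system.

```typescript
// One recorded inference call (hypothetical shape for this sketch).
interface CallSample {
  model: string;
  latencyMs: number;
  totalTokens: number;
}

class UsageMonitor {
  private samples: CallSample[] = [];

  record(sample: CallSample): void {
    this.samples.push(sample);
  }

  // Sum of tokens across all calls, or only for one model if given.
  totalTokens(model?: string): number {
    return this.samples
      .filter((s) => model === undefined || s.model === model)
      .reduce((sum, s) => sum + s.totalTokens, 0);
  }

  // Nearest-rank percentile of latency; returns NaN when nothing is recorded.
  latencyPercentile(p: number): number {
    if (this.samples.length === 0) return NaN;
    const sorted = this.samples.map((s) => s.latencyMs).sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank - 1, 0)];
  }
}
```

Recording one sample per request and checking `latencyPercentile(95)` per model gives a quick read on whether a model still fits its latency budget.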
Architecture Diagram
┌─────────────────────────────────────────────────────┐
│                  Application Layer                  │
│  Chat UI │ API Backend │ Batch Processor │ Agent    │
└──────────┬──────────────┬───────────────┬───────────┘
           │              │               │
           ▼              ▼               ▼