ai-infrastructure-huggingface-inference


Hugging Face Inference Patterns

Quick Guide: Use @huggingface/inference (v4+) to access 200k+ ML models on the Hugging Face Hub. Use InferenceClient with chatCompletion() for OpenAI-compatible chat, textGeneration() for raw text completion, chatCompletionStream() for streaming, featureExtraction() for embeddings, textToImage() for image generation, and automaticSpeechRecognition() for audio transcription. Set provider to route through inference providers (Cerebras, Together, Groq, etc.) or use endpointUrl for dedicated Inference Endpoints.


<critical_requirements>

CRITICAL: Before Using This Skill

All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)

(You MUST always pass an access token to InferenceClient -- never deploy without authentication)

(You MUST use chatCompletion() / chatCompletionStream() for conversational LLM tasks -- these follow the OpenAI-compatible message format)

(You MUST handle errors using InferenceClientError and its subclasses -- never use bare catch blocks without error type checking)

(You MUST specify a model parameter for every inference call -- there is no default model)

</critical_requirements>

Installs: 2 · GitHub Stars: 6 · First Seen: Apr 7, 2026