ai-infrastructure-huggingface-inference


Hugging Face Inference Patterns

Quick Guide: Use @huggingface/inference (v4+) to access 200k+ ML models on the Hugging Face Hub. Use InferenceClient with chatCompletion() for OpenAI-compatible chat, textGeneration() for raw text completion, chatCompletionStream() for streaming, featureExtraction() for embeddings, textToImage() for image generation, and automaticSpeechRecognition() for audio transcription. Set provider to route through inference providers (Cerebras, Together, Groq, etc.) or use endpointUrl for dedicated Inference Endpoints.


<critical_requirements>

CRITICAL: Before Using This Skill

All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)

(You MUST always pass an access token to InferenceClient -- never deploy without authentication)

(You MUST use chatCompletion() / chatCompletionStream() for conversational LLM tasks -- these follow the OpenAI-compatible message format)

(You MUST handle errors using InferenceClientError and its subclasses -- never use bare catch blocks without error type checking)

(You MUST specify a model parameter for every inference call -- there is no default model)

</critical_requirements>

Installs: 2 · GitHub Stars: 6 · First Seen: Apr 7, 2026