text-generation-inference

Text Generation Inference (TGI)

Expert guidance for Hugging Face's production LLM inference server.

Triggers

Use this skill when:

  • Deploying LLMs in production environments
  • Setting up high-throughput model serving
  • Configuring quantization for inference optimization
  • Working with Hugging Face Text Generation Inference
  • Implementing continuous batching or tensor parallelism
  • Keywords: tgi, text generation inference, huggingface serving, llm deployment, continuous batching, tensor parallelism

Installation

Docker
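
As a minimal sketch of what a Docker launch might look like, the Python helper below assembles a `docker run` command for the TGI container. The image name and the `--model-id`, `--num-shard`, and `--quantize` launcher flags come from the upstream TGI documentation. The helper name `build_tgi_command`, the default host port, the cache directory, and the model ID are hypothetical placeholders, not part of this skill.

```python
import shlex

# Image published by Hugging Face; ":latest" is used here only for illustration,
# a pinned version tag is usually safer in production.
TGI_IMAGE = "ghcr.io/huggingface/text-generation-inference:latest"


def build_tgi_command(model_id, host_port=8080, volume="/tmp/tgi-data",
                      num_shard=None, quantize=None):
    """Return the argv list for launching TGI in Docker (hypothetical helper)."""
    cmd = [
        "docker", "run",
        "--gpus", "all",            # expose all GPUs to the container
        "--shm-size", "1g",         # shared memory for multi-GPU communication
        "-p", f"{host_port}:80",    # the server listens on port 80 inside the container
        "-v", f"{volume}:/data",    # cache downloaded weights on the host
        TGI_IMAGE,
        "--model-id", model_id,
    ]
    if num_shard is not None:
        # Tensor parallelism: shard the model across this many GPUs.
        cmd += ["--num-shard", str(num_shard)]
    if quantize is not None:
        # Quantization scheme, for example "awq", "gptq", or "bitsandbytes".
        cmd += ["--quantize", quantize]
    return cmd


if __name__ == "__main__":
    # "your-org/your-model" is a placeholder model ID.
    print(shlex.join(build_tgi_command("your-org/your-model", num_shard=2)))
```

Running the script only prints the command, so it can be reviewed before being pasted into a shell. Pass `quantize=` or `num_shard=` only when the chosen model and hardware support them.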

First Seen
Mar 15, 2026
text-generation-inference — housegarofalo/claude-code-base