text-generation-inference
Text Generation Inference (TGI)
Expert guidance for Hugging Face's production LLM inference server.
Triggers
Use this skill when:
- Deploying LLMs in production environments
- Setting up high-throughput model serving
- Configuring quantization for inference optimization
- Working with Hugging Face Text Generation Inference
- Implementing continuous batching or tensor parallelism
- The request mentions any of these keywords: tgi, text generation inference, huggingface serving, llm deployment, continuous batching, tensor parallelism