LLM Inference

High-performance inference engines for serving large language models.


Engine Comparison

| Engine | Best For | Hardware | Throughput | Setup |
|---|---|---|---|---|
| vLLM | Production serving | GPU | Highest | Medium |
| llama.cpp | Local/edge, CPU | CPU/GPU | Good | Easy |
| TGI | HuggingFace models | GPU | High | Easy |
| Ollama | Local desktop | CPU/GPU | Good | Easiest |
| TensorRT-LLM | NVIDIA production | NVIDIA GPU | Highest | Complex |
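
To make the "production serving" row concrete, below is a minimal offline-batching sketch using vLLM's Python API. The model id and sampling values are placeholders, not recommendations from this skill.

```python
from vllm import LLM, SamplingParams

# Load the model once; vLLM handles batching and KV-cache paging internally.
# The model id is only an example -- substitute any model you have access to.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

prompts = [
    "Explain the difference between throughput and latency in one sentence.",
    "List three reasons to quantize a model.",
]

# generate() batches all prompts together and returns one RequestOutput per prompt.
for out in llm.generate(prompts, sampling):
    print(out.prompt, "->", out.outputs[0].text)
```

For online serving, the same engine can also be started as an OpenAI-compatible HTTP server with `vllm serve <model>`, which is the more common deployment path in production.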

Decision Guide

- GPU production serving with maximum throughput: vLLM, or TensorRT-LLM if you are committed to NVIDIA hardware and can absorb the complex setup.
- HuggingFace-hosted models with an easy GPU setup: TGI.
- CPU-only or edge deployments: llama.cpp (see the sketch after this list).
- Local desktop use with the easiest setup: Ollama.
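
To illustrate the CPU/edge branch, here is a small sketch using the llama-cpp-python bindings for llama.cpp. The GGUF path, model, and thread count are assumptions about a local setup, not part of this skill.

```python
from llama_cpp import Llama

# Point model_path at any locally downloaded GGUF file (path is an example).
llm = Llama(
    model_path="./models/llama-3.2-1b-instruct-q4_k_m.gguf",
    n_ctx=2048,    # context window size
    n_threads=8,   # CPU threads to use
)

out = llm(
    "Q: Name one advantage of running inference on the CPU.\nA:",
    max_tokens=64,
    stop=["\n"],
)
print(out["choices"][0]["text"].strip())
```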
