Hugging Face Local Models

Search the Hugging Face Hub for llama.cpp-compatible GGUF repos, choose the right quant, and launch the model with llama-cli or llama-server.

Default Workflow

  1. Search the Hub with the apps=llama.cpp filter.
  2. Open https://huggingface.co/<repo>?local-app=llama.cpp.
  3. Prefer the exact HF local-app snippet and quant recommendation whenever the model page shows one.
  4. Confirm exact .gguf filenames with https://huggingface.co/api/models/<repo>/tree/main?recursive=true.
  5. Launch with llama-cli -hf <repo>:<QUANT> or llama-server -hf <repo>:<QUANT>.
  6. Fall back to --hf-repo plus --hf-file when the repo uses custom file naming.
  7. Convert from Transformers weights only if the repo does not already expose GGUF files.
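
Put together, the workflow above might look like the following on the command line. The <repo>, <term>, and filename values are placeholders, and Q4_K_M is just an example quant label; -hf, --hf-repo, and --hf-file are llama.cpp flags, and the search and tree URLs are Hub endpoints:

```shell
# 1. Search the Hub for llama.cpp-compatible GGUF repos (browser URL):
#    https://huggingface.co/models?apps=llama.cpp&search=<term>

# 4. Confirm the exact .gguf filenames in the repo tree:
curl -s "https://huggingface.co/api/models/<repo>/tree/main?recursive=true"

# 5. Launch with the repo:quant shorthand (downloads and caches the file):
llama-server -hf <repo>:Q4_K_M

# 6. Fall back to explicit repo + file when the repo uses custom naming:
llama-cli --hf-repo <repo> --hf-file <exact-filename>.gguf
```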

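The filename check in step 4 can be sketched in Python. The tree API returns a JSON array of entries with a path field; filtering for .gguf and pulling a quant label out of the filename might look like this. The sample payload and the <name>-<QUANT>.gguf naming convention are illustrative assumptions, not guaranteed by every repo:

```python
import re

# Illustrative sample of what
# GET https://huggingface.co/api/models/<repo>/tree/main?recursive=true
# returns: a JSON array of entries, each with a "path" field.
# These filenames are hypothetical, not a real repo listing.
sample_tree = [
    {"type": "file", "path": "README.md"},
    {"type": "file", "path": "model-Q4_K_M.gguf"},
    {"type": "file", "path": "model-Q8_0.gguf"},
    {"type": "file", "path": "config.json"},
]

def gguf_quants(tree_entries):
    """Map quant label -> .gguf filename, assuming the common
    <name>-<QUANT>.gguf convention; unparsable names map to themselves."""
    quants = {}
    for entry in tree_entries:
        path = entry["path"]
        if not path.endswith(".gguf"):
            continue
        m = re.search(r"-([A-Za-z0-9_]+)\.gguf$", path)
        label = m.group(1) if m else path
        quants[label] = path
    return quants

print(gguf_quants(sample_tree))
# -> {'Q4_K_M': 'model-Q4_K_M.gguf', 'Q8_0': 'model-Q8_0.gguf'}
```

The label on the right of the colon in llama-cli -hf <repo>:<QUANT> must match one of these files, so resolving it from the tree listing avoids a failed download.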
Quick Start

Install llama.cpp
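
A minimal install sketch, assuming either Homebrew or a CMake toolchain is available (prebuilt release binaries and other package managers are further options):

```shell
# Option 1: package manager (macOS or Linux with Homebrew)
brew install llama.cpp

# Option 2: build from source
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```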
