Hugging Face Local Models

Search the Hugging Face Hub for llama.cpp-compatible GGUF repos, choose the right quant, and launch the model with llama-cli or llama-server.

Default Workflow

  1. Search the Hub with the apps=llama.cpp filter.
  2. Open https://huggingface.co/<repo>?local-app=llama.cpp.
  3. Prefer the exact HF local-app snippet and quant recommendation whenever the model page shows one.
  4. Confirm exact .gguf filenames with https://huggingface.co/api/models/<repo>/tree/main?recursive=true.
  5. Launch with llama-cli -hf <repo>:<QUANT> or llama-server -hf <repo>:<QUANT>.
  6. Fall back to --hf-repo plus --hf-file when the repo uses custom file naming.
  7. Convert from Transformers weights only if the repo does not already expose GGUF files.
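
Put together, the workflow above might look like the following on the command line. The <repo>, <term>, and filename values are placeholders, and Q4_K_M is just an example quant label; -hf, --hf-repo, and --hf-file are llama.cpp flags, and the search and tree URLs are Hub endpoints:

```shell
# 1. Search the Hub for llama.cpp-compatible GGUF repos (browser URL):
#    https://huggingface.co/models?apps=llama.cpp&search=<term>

# 4. Confirm the exact .gguf filenames in the repo tree:
curl -s "https://huggingface.co/api/models/<repo>/tree/main?recursive=true"

# 5. Launch with the repo:quant shorthand (downloads and caches the file):
llama-server -hf <repo>:Q4_K_M

# 6. Fall back to explicit repo + file when the repo uses custom naming:
llama-cli --hf-repo <repo> --hf-file <exact-filename>.gguf
```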

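The filename check in step 4 can be sketched in Python. The tree API returns a JSON array of entries with a path field; filtering for .gguf and pulling a quant label out of the filename might look like this. The sample payload and the <name>-<QUANT>.gguf naming convention are illustrative assumptions, not guaranteed by every repo:

```python
import re

# Illustrative sample of what
# GET https://huggingface.co/api/models/<repo>/tree/main?recursive=true
# returns: a JSON array of entries, each with a "path" field.
# These filenames are hypothetical, not a real repo listing.
sample_tree = [
    {"type": "file", "path": "README.md"},
    {"type": "file", "path": "model-Q4_K_M.gguf"},
    {"type": "file", "path": "model-Q8_0.gguf"},
    {"type": "file", "path": "config.json"},
]

def gguf_quants(tree_entries):
    """Map quant label -> .gguf filename, assuming the common
    <name>-<QUANT>.gguf convention; unparsable names map to themselves."""
    quants = {}
    for entry in tree_entries:
        path = entry["path"]
        if not path.endswith(".gguf"):
            continue
        m = re.search(r"-([A-Za-z0-9_]+)\.gguf$", path)
        label = m.group(1) if m else path
        quants[label] = path
    return quants

print(gguf_quants(sample_tree))
# -> {'Q4_K_M': 'model-Q4_K_M.gguf', 'Q8_0': 'model-Q8_0.gguf'}
```

The label on the right of the colon in llama-cli -hf <repo>:<QUANT> must match one of these files, so resolving it from the tree listing avoids a failed download.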
Quick Start

Install llama.cpp
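
A minimal install sketch, assuming either Homebrew or a CMake toolchain is available (prebuilt release binaries and other package managers are further options):

```shell
# Option 1: package manager (macOS or Linux with Homebrew)
brew install llama.cpp

# Option 2: build from source
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```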
