Pull a llamacpp Model

This machine uses kyuz0/amd-strix-halo-toolboxes:rocm-7.2 for llamacpp inference (AMD Strix Halo / gfx1151, which the official ROCm build does not support). Harbor's pull mechanism starts an ephemeral container with --n-gpu-layers 0, and without ROCm device access the custom image fails in that context. Use the standard CPU image just for pulling, then restore the custom image.

Steps

1. Switch to the standard CPU image

harbor config set llamacpp.image.rocm ghcr.io/ggml-org/llama.cpp:server

2. Pull the model

harbor pull <hf-owner/model-repo:quantization>
# Examples:
harbor pull bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
harbor pull unsloth/Mistral-Small-3.1-24B-Instruct-2503-GGUF:UD-Q4_K_XL
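
3. Restore the custom ROCm image

Once the pull completes, switch back so inference runs on the GPU-enabled toolbox image. This sketch assumes the same `harbor config set` key used in step 1 and the image name stated above:

```shell
# Restore the Strix Halo ROCm image for llamacpp inference
harbor config set llamacpp.image.rocm kyuz0/amd-strix-halo-toolboxes:rocm-7.2
```

You can confirm the active image with `harbor config get llamacpp.image.rocm` before starting the service again.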