nemoclaw-user-configure-inference

Use a Local Inference Server

Gotchas

  • Ollama is convenient for local chat, but under realistic agent load some model/template combinations can return tool calls as plain text in the message content instead of as structured tool calls.
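
One way to notice this failure mode is to inspect each assistant message before the agent acts on it. The sketch below assumes the server returns OpenAI-style chat messages (a `tool_calls` list plus a `content` string) and that a leaked tool call appears as a JSON object with a `name` key and an `arguments` or `parameters` key. Both the function name `looks_like_plaintext_tool_call` and that heuristic are illustrative assumptions, not part of NemoClaw or Ollama.

```python
import json


def looks_like_plaintext_tool_call(message: dict) -> bool:
    """Heuristic check on an OpenAI-style assistant message (assumed shape).

    Returns True when the message has no structured tool_calls but its
    text content parses as a JSON object that resembles a tool call.
    """
    if message.get("tool_calls"):
        # A structured tool call is present, so nothing leaked into the text.
        return False

    content = (message.get("content") or "").strip()
    if not content.startswith("{"):
        return False

    try:
        payload = json.loads(content)
    except json.JSONDecodeError:
        return False

    # Assumed heuristic: a tool call names a function and carries arguments.
    return (
        isinstance(payload, dict)
        and "name" in payload
        and ("arguments" in payload or "parameters" in payload)
    )


if __name__ == "__main__":
    leaked = {"content": '{"name": "get_weather", "arguments": {"city": "Paris"}}'}
    print(looks_like_plaintext_tool_call(leaked))
```

A check like this only flags suspicious messages. Whether to retry, switch models, or parse the text yourself depends on the agent.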

Prerequisites

  • NemoClaw installed.
  • A local model server that is already running, or a supported Ollama, vLLM, or NIM setup that the NemoClaw onboard wizard can use, start, or install.

NemoClaw can route inference to a model server running on your machine instead of a cloud API. This page covers Ollama, compatible-endpoint paths for other servers, and experimental managed options for vLLM and NVIDIA NIM.
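
Before pointing NemoClaw at a local server, it can help to confirm that the server is reachable and see which models it reports. The following is a minimal sketch that queries the OpenAI-compatible `/v1/models` route using only the Python standard library. The base URLs are assumptions drawn from the usual defaults (port 11434 for Ollama, port 8000 for vLLM's OpenAI-compatible server); adjust them to match your setup. The helper names are hypothetical, and the script does not call NemoClaw itself.

```python
import json
import urllib.request
from typing import List, Optional

# Assumed default endpoints; change these if your server uses another port.
ASSUMED_BASE_URLS = {
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(body: str) -> List[str]:
    """Extract model ids from an OpenAI-style model list response body."""
    data = json.loads(body)
    return [m["id"] for m in data.get("data", []) if "id" in m]


def list_local_models(base_url: str, timeout: float = 3.0) -> Optional[List[str]]:
    """Return the server's model ids, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return parse_model_ids(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts;
        # ValueError covers a response body that is not valid JSON.
        return None


if __name__ == "__main__":
    for name, url in ASSUMED_BASE_URLS.items():
        models = list_local_models(url)
        status = "unreachable" if models is None else ", ".join(models) or "no models"
        print(f"{name}: {status}")
```

If a server shows as unreachable, check that it is running and listening on the expected port before running the onboard wizard.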

Installs: 210
Repository: nvidia/skills
GitHub Stars: 1.0K
First Seen: May 15, 2026