nemoclaw-user-configure-inference
Use a Local Inference Server
Gotchas
- Ollama is convenient for local chat, but some model/template combinations can return tool calls as plain text under realistic agent load.
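One way to notice this failure mode is to inspect the assistant message returned by the server. The sketch below is a heuristic, not part of NemoClaw: it assumes an OpenAI-style chat message (a dict with optional `tool_calls` and `content` keys), and the helper name `looks_like_text_tool_call` and the JSON shape it checks for are illustrative assumptions.

```python
import json


def looks_like_text_tool_call(message: dict) -> bool:
    """Heuristically flag a tool call that arrived as plain text.

    Assumes an OpenAI-style assistant message. The JSON shape checked
    here ("name" plus "arguments" or "parameters") is an assumption;
    real models may emit other formats.
    """
    if message.get("tool_calls"):
        # A structured tool call is present, so nothing leaked into text.
        return False
    content = (message.get("content") or "").strip()
    if not content.startswith("{"):
        return False
    try:
        payload = json.loads(content)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and "name" in payload and (
        "arguments" in payload or "parameters" in payload
    )
```

A check like this only catches the simplest case, where the whole message body is a JSON object. Tool calls wrapped in prose or in model-specific tags would need additional patterns.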
Prerequisites
- NemoClaw installed.
- A running local model server, or a supported Ollama, vLLM, or NIM setup that the NemoClaw onboard wizard can use, start, or install.
NemoClaw can route inference to a model server running on your machine instead of a cloud API. This page covers Ollama, compatible-endpoint paths for other servers, and experimental managed options for vLLM and NVIDIA NIM.
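Before pointing NemoClaw at a local server, it can help to confirm that the server answers on its OpenAI-compatible endpoint. The following is a minimal sketch using only the Python standard library; it does not call any NemoClaw API. The base URL assumes Ollama's default port (11434) and its `/v1` compatibility path, and the helper names are hypothetical. Adjust the URL for vLLM or another server.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

# Assumption: Ollama's default address with its OpenAI-compatible /v1 path.
# Other servers (for example vLLM) typically listen on a different port.
OLLAMA_BASE = "http://localhost:11434/v1"


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> List[str]:
    """Extract model ids from an OpenAI-style model list response."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]


def list_local_models(
    base_url: str = OLLAMA_BASE, timeout: float = 3.0
) -> Optional[List[str]]:
    """Return the served model ids, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return parse_model_ids(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response body.
        return None


if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("No OpenAI-compatible server reachable at", OLLAMA_BASE)
    else:
        print("Models available:", models)
```

A `None` result means nothing answered at that address. An empty list means the server is up but has no models loaded or pulled yet.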