nemo-retriever


The retriever CLI indexes a folder of PDFs into LanceDB (retriever ingest) and serves vector search over that index (retriever query). For any task that involves searching or answering questions across a folder of PDFs, use this CLI; do not write a custom RAG pipeline.

Beyond PDFs and beyond semantic search. retriever ingest also handles images, Office, HTML, TXT, audio, and video — see references/setup.md for the per-format recipe and references/install.md for the install extras ([multimedia], libreoffice, ffmpeg). For non-semantic operations — page filter, verbatim quote with citation, corpus-level aggregate, chart/image caption hits — see references/query.md. Don't fall back to native Read/Grep/Python on non-PDF inputs.

Install (if retriever is missing)

If command -v retriever returns nothing, follow references/install.md to install the NeMo Retriever Library before proceeding. The install prints RETRIEVER_VENV=<path>; substitute that path for <RETRIEVER_VENV> in every example in this skill (setup, query, troubleshooting, and the CLI references).
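The presence check described above can be sketched as a small shell helper. The function name have_retriever is hypothetical and exists only for this illustration; the only real command used is the POSIX command -v lookup.

```shell
# Hypothetical helper: succeeds only if a `retriever` executable is on PATH.
have_retriever() {
  command -v retriever >/dev/null 2>&1
}

if have_retriever; then
  echo "retriever found at: $(command -v retriever)"
else
  # Nothing on PATH: install first, as references/install.md describes.
  echo "retriever missing: follow references/install.md before proceeding"
fi
```

Wrapping the lookup in a function keeps the check reusable at the start of each turn without repeating the redirection.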

Workflow — read the reference for the current phase, then execute

| Turn type | Read this once | Then execute |
| --- | --- | --- |
| Setup turn (first turn: ./lancedb/nv-ingest.lance doesn't exist) | references/setup.md | Build the index |
| Query turn (every subsequent turn: user asks a question) | references/query.md | One retriever query call |
| Anything errored or returned empty | references/troubleshooting.md | Apply the named recovery; do not improvise |
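As a rough sketch of the setup-versus-query decision in the workflow above, the check comes down to whether the index path exists. The function name turn_reference is hypothetical; the index path and reference file names are the ones listed in the workflow.

```shell
# Index location named in the workflow; its absence marks a setup turn.
INDEX_PATH="./lancedb/nv-ingest.lance"

# Hypothetical helper: prints which reference to read for the current turn.
# $1 is the index path to test.
turn_reference() {
  if [ -e "$1" ]; then
    echo "references/query.md"    # index exists: query turn
  else
    echo "references/setup.md"    # no index yet: setup turn
  fi
}

echo "Read once, then execute: $(turn_reference "$INDEX_PATH")"
```

The error path (references/troubleshooting.md) is left out here because it depends on the outcome of a command rather than on the state of the index.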

For the full retriever ingest / retriever query CLI specs, see references/cli/ingest.md and references/cli/query.md. You do not need these for routine turns — <RETRIEVER_VENV>/bin/retriever <subcommand> --help is faster.
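A minimal sketch of the --help shortcut, assuming RETRIEVER_VENV has been set to the path printed during install. The fallback /path/to/venv is a placeholder and the show_help wrapper is hypothetical; only the <RETRIEVER_VENV>/bin/retriever <subcommand> --help form comes from this skill.

```shell
# RETRIEVER_VENV should hold the path printed by the install step.
# The default below is a placeholder (assumption), not a real location.
RETRIEVER_VENV="${RETRIEVER_VENV:-/path/to/venv}"

# Hypothetical wrapper: print --help for one subcommand (ingest or query).
show_help() {
  bin="$RETRIEVER_VENV/bin/retriever"
  if [ -x "$bin" ]; then
    "$bin" "$1" --help
  else
    echo "retriever not found at $bin; see references/install.md" >&2
    return 1
  fi
}

show_help ingest || true
show_help query || true
```

The guard avoids a confusing "No such file" error when the placeholder path was never replaced.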

Repository: nvidia/skills