create-context-tests
`nao test` runs each natural-language prompt through the agent, executes both the agent's SQL and the test's expected SQL against the warehouse, and diffs the result data row by row. A test passes only if the actual data matches: same rows, same values. The suite is the reliability benchmark; every change to RULES.md is measured against it. Reference: docs.getnao.io/nao-agent/context-engineering/evaluation.
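To make the pass criterion concrete, here is a minimal sketch of a result-set comparison. It is not nao's actual implementation: the function names, the float rounding, and the choice to ignore row order are assumptions made for illustration.

```python
from collections import Counter


def normalize(value):
    """Round floats so tiny floating-point noise does not fail a comparison."""
    if isinstance(value, float):
        return round(value, 6)  # assumed tolerance, not taken from nao
    return value


def results_match(actual_rows, expected_rows):
    """Return True when both result sets hold the same rows with the same values.

    Rows are compared as multisets: order is ignored (an assumption),
    but duplicate rows still count.
    """
    def as_multiset(rows):
        return Counter(tuple(normalize(v) for v in row) for row in rows)

    return as_multiset(actual_rows) == as_multiset(expected_rows)
```

Under these assumptions, a query that returns the right rows in a different order still passes, while a missing or duplicated row fails.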
How many tests
One test per key metric in ## Key Metrics Reference is the floor. Then add tests for: time scoping (especially "last 8 weeks" / "last 30 days"), CTE / multi-step queries, edge cases (NULLs, empty windows), and ambiguous wording ("our users", "active") to validate naming-convention rules.
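The "one test per key metric" floor can be checked mechanically. The sketch below assumes each test is tagged with the metric it covers; the `metric` field and the metric names are hypothetical and not part of any nao test format.

```python
def missing_metric_tests(key_metrics, tests):
    """Return key metrics that no test covers, in their original order."""
    covered = {test["metric"] for test in tests if test.get("metric")}
    return [metric for metric in key_metrics if metric not in covered]


# Hypothetical metric names and test records, for illustration only.
key_metrics = ["mrr", "churn_rate", "active_users"]
tests = [
    {"prompt": "What's our MRR?", "metric": "mrr"},
    {"prompt": "How's churn looking this quarter?", "metric": "churn_rate"},
    {"prompt": "Signups in the last 8 weeks?", "metric": None},  # time-scoping test
]

print(missing_metric_tests(key_metrics, tests))  # prints ['active_users']
```

Tests for time scoping, multi-step queries, edge cases, and ambiguous wording sit on top of this floor and do not need a metric tag.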
Two authoring rules — apply to every test
Rule 1 — Prompts read like real chat. Vague, short, no table/column/method hints. The test verifies the agent reaches the right answer from a real-user input.
| Bad | Good |
|---|---|
| "What was the churn rate from fct_subscriptions in Q1?" | "How's churn looking this quarter?" |
| "Compute MRR as SUM(mrr_amount) where status='active'" | "What's our MRR?" |
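A rough way to screen prompts against Rule 1 is to flag anything that looks like a table, column, or method hint. The heuristic below is an illustrative sketch, not a nao feature; the patterns and the `prompt_hints` name are assumptions, and it will miss hints that do not look like identifiers.

```python
import re

# Aggregate function calls such as SUM(...) or count(...).
SQL_CALL = re.compile(r"\b(?:sum|count|avg|min|max)\s*\(", re.IGNORECASE)
# Lowercase identifiers joined by underscores, e.g. fct_subscriptions.
SNAKE_CASE = re.compile(r"\b[a-z0-9]+(?:_[a-z0-9]+)+\b")


def prompt_hints(prompt):
    """Return the kinds of implementation hints found in a test prompt."""
    hints = []
    if SQL_CALL.search(prompt):
        hints.append("SQL function call")
    if SNAKE_CASE.search(prompt):
        hints.append("snake_case identifier")
    return hints
```

Applied to the table above, both "Bad" prompts are flagged and both "Good" prompts come back with an empty list.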
Rule 2 — Output column names encode format / unit, not source. A column name communicates how to interpret the value.
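One way to apply Rule 2 when reviewing expected SQL is to check that each output column ends in a unit or format suffix. The suffix list below is hypothetical; the source does not specify a naming convention, so substitute whatever your project uses.

```python
# Hypothetical unit/format suffixes; replace with your project's convention.
UNIT_SUFFIXES = ("_pct", "_usd", "_count", "_days", "_ratio")


def has_unit_suffix(column_name):
    """Return True when the column name ends in a known unit/format suffix."""
    return column_name.lower().endswith(UNIT_SUFFIXES)
```

With this list, a name like `churn_rate_pct` passes, while a source-oriented name like `fct_subscriptions_churn` does not.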