omni-ai-eval
Omni AI Eval
Omni ships a first-class eval system (the AI Hub → Prompt sets and Eval runs). This skill drives it through the Omni CLI: define a reusable prompt set, start a judged eval run against a model or branch, and read per-prompt verdicts from Omni's built-in accuracy judge.
Prefer this native system over building your own harness. The judge scores each answer semantically against the full agent conversation — it does not require golden query JSON, and it evaluates the whole agentic workflow (topic selection, queries, results, and the final written answer), not just generated query structure.
Tip: Use omni-ai-optimizer to improve scores after finding failures, omni-model-builder to apply context changes on a branch before A/B testing, and omni-model-explorer to discover topics and fields when writing prompts.
Prerequisites
# Verify the Omni CLI is installed — if not, ask the user to install it.
# See: https://github.com/exploreomni/cli#readme
command -v omni >/dev/null || echo "ERROR: Omni CLI is not installed."
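The check above prints a message but lets a script keep going. If the later steps should stop when the CLI is missing, a minimal sketch is to wrap the same `command -v` test in a function that returns a non-zero status. The name `require_cli` is a hypothetical helper for illustration and is not part of the Omni CLI.

```shell
# Hypothetical helper (not part of the Omni CLI): succeeds only if the named
# command is found on PATH; otherwise prints an error to stderr and returns 1.
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "ERROR: $1 is not installed." >&2
  return 1
}
```

Called as `require_cli omni || exit 1` at the top of a script, this stops before any eval step runs. The error goes to stderr so it does not mix with output a caller might capture.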