evaluation-suites


Evaluation Suites

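Evaluation suites provide structured testing. Assertions are plain-language strings checked by an LLM judge, and an execution policy sets how many times each item runs and how many of those runs must pass.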

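from opik import Opik

client = Opik()

# Fetch the suite by name, creating it if it does not exist yet.
suite = client.get_or_create_evaluation_suite(
    name="my-suite",
    # Suite-level assertions: plain strings evaluated by the LLM judge for every item.
    assertions=["Response is factually accurate", "Response is professional"],
    # Run each item 3 times; pass_threshold is presumably the number of runs that must pass.
    execution_policy={"runs_per_item": 3, "pass_threshold": 2},
)

# This item is checked against the suite-level assertions only.
suite.add_item(data={"input": "What is ML?"})

# This item carries an extra assertion of its own.
suite.add_item(
    data={"input": "Should I take this medication?"},
    assertions=["Response advises consulting a doctor"],  # item-level, applied in addition to the suite-level assertions
)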
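
To make the execution policy concrete, here is a minimal sketch of how a pass/fail decision could be derived from judge verdicts. It does not call Opik and is not its implementation. The helper names (effective_assertions, run_passes, item_passes) and the verdict dictionaries are hypothetical, and the sketch assumes that a run passes only when every assertion holds and that an item passes when at least pass_threshold of its runs pass.

```python
from typing import Dict, List


def effective_assertions(suite_level: List[str], item_level: List[str]) -> List[str]:
    """Combine suite-level and item-level assertions, keeping order and dropping duplicates."""
    return list(dict.fromkeys(suite_level + item_level))


def run_passes(verdicts: Dict[str, bool], assertions: List[str]) -> bool:
    # Assumption: a single run passes only if the judge accepts every assertion.
    # An assertion with no recorded verdict counts as a failure.
    return all(verdicts.get(a, False) for a in assertions)


def item_passes(runs: List[Dict[str, bool]], assertions: List[str], pass_threshold: int) -> bool:
    # Assumption: an item passes when at least `pass_threshold` of its runs pass.
    passed = sum(run_passes(verdicts, assertions) for verdicts in runs)
    return passed >= pass_threshold


suite_level = ["Response is factually accurate", "Response is professional"]
item_level = ["Response advises consulting a doctor"]
assertions = effective_assertions(suite_level, item_level)

# Hypothetical judge verdicts for three runs (runs_per_item=3).
# The third run fails one assertion, so two of the three runs pass.
runs = [
    {a: True for a in assertions},
    {a: True for a in assertions},
    {**{a: True for a in assertions}, "Response is professional": False},
]

result = item_passes(runs, assertions, pass_threshold=2)
print(result)
```

Under these assumptions the medication item passes with pass_threshold=2 and would fail with pass_threshold=3, which is why the threshold is worth choosing relative to runs_per_item.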
First Seen
Mar 30, 2026
evaluation-suites — comet-ml/opik-skills