evaluation-suites
Installation
SKILL.md
Evaluation Suites
Structured testing with assertions (plain strings for LLM judge) and execution policies.
from opik import Opik
client = Opik()
suite = client.get_or_create_evaluation_suite(
name="my-suite",
assertions=["Response is factually accurate", "Response is professional"],
execution_policy={"runs_per_item": 3, "pass_threshold": 2},
)
suite.add_item(data={"input": "What is ML?"})
suite.add_item(
data={"input": "Should I take this medication?"},
assertions=["Response advises consulting a doctor"], # item-level, added to suite-level
)