zeroeval-install
ZeroEval Install and Integrate
Guide users from zero to production-ready ZeroEval integration: tracing, prompt management, and automated judges.
When To Use
- Setting up ZeroEval for the first time in any language.
- Adding tracing/observability to an existing AI app, agent, or pipeline.
- Migrating hardcoded prompts to ze.prompt with staged rollout (Python / TypeScript).
- Choosing and configuring judges for automated evaluation.
- Troubleshooting missing traces, broken feedback loops, or prompt metadata issues.
Execution Sequence
Follow these steps in order. Each step references a specific playbook in references/ for deep details; load only the relevant playbook when needed.
Step 1: Detect Integration Path
Determine which integration path fits the user's setup: the Python or TypeScript SDK when the language is supported, or direct REST/OTLP ingestion (see custom-tracing below) when it is not.
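For the most common path, here is a minimal Python bootstrap sketch. The ze.init() call, the @ze.span decorator, and the ZEROEVAL_API_KEY environment convention are assumptions modeled on typical SDK layouts; confirm the exact names in the install playbook before relying on them.

```python
# Hedged sketch of a first trace with the Python SDK. ze.init(), @ze.span,
# and the ZEROEVAL_API_KEY convention are assumptions -- verify against the
# install playbook.
import zeroeval as ze

ze.init()  # assumed: reads ZEROEVAL_API_KEY and starts exporting spans

@ze.span(name="hello-trace")  # assumed decorator that wraps the call in a span
def hello() -> str:
    return "first traced call"

hello()  # then check the ZeroEval dashboard for the new trace
```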
More from zeroeval/zeroeval-skills
manage-data
Create, load, push, version, and manage benchmark datasets with the ZeroEval Python SDK or git. Use when adding data to a benchmark, creating a dataset from code or CSV, pushing data to the backend, managing subsets, pulling existing benchmarks, converting data to Parquet, or setting up a git-based data workflow. Triggers on "add data", "create dataset", "push dataset", "upload data", "manage benchmark data", "dataset versioning", "subsets", "pull dataset", "parquet", "multimodal dataset".
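As orientation for the dataset workflow described above, a hedged sketch; ze.Dataset and dataset.push() are assumed names, and the manage-data skill has the real constructor and methods.

```python
# Hedged sketch of creating and pushing a benchmark dataset. ze.Dataset and
# dataset.push() are assumed names -- see the manage-data skill for the
# actual API.
import zeroeval as ze

dataset = ze.Dataset(
    name="support-tickets",
    data=[
        {"question": "How do I reset my password?", "expected": "password_reset"},
        {"question": "Please cancel my subscription", "expected": "cancellation"},
    ],
)

dataset.push()  # version the rows on the ZeroEval backend so evals can pull them
```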
run-evals
Write tasks, evaluations, and scoring pipelines with the ZeroEval Python SDK. Covers defining @ze.task functions, running evals with dataset.eval(), writing row/column/run evaluators, scoring with column_map, emitting signals, configuring execution (workers, retries, checkpoints), repeating and resuming runs, and inspecting results. Triggers on "run evals", "write evaluation", "benchmark model", "score results", "evaluation pipeline", "task decorator", "scoring function", "column_map", "emit signal", "resume eval", "repeat eval".
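The description above names @ze.task, dataset.eval(), and column_map; here is a hedged sketch of how they fit together. Exact signatures may differ, and the run-evals skill is authoritative.

```python
# Hedged eval-pipeline sketch built around @ze.task, dataset.eval(), and
# column_map from the description above; argument names are assumptions.
import zeroeval as ze

def stub_model(question: str) -> str:
    # Stand-in for the real model call being benchmarked.
    return "password_reset" if "password" in question.lower() else "cancellation"

@ze.task
def classify(row: dict) -> dict:
    return {"predicted": stub_model(row["question"])}

def exact_match(row: dict) -> float:
    # Row-level evaluator: 1.0 when the prediction matches the label.
    return float(row["predicted"] == row["expected"])

dataset = ze.Dataset.pull("support-tickets")  # assumed: pulls the pushed benchmark
dataset.eval(
    task=classify,
    evaluators=[exact_match],
    column_map={"question": "question", "expected": "expected"},
)
```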
create-judge
This skill should be used when users want to create, design, or configure an automated judge in ZeroEval. It guides through understanding the evaluation goal, choosing binary vs scored evaluation, writing the judge template, designing structured criteria, and creating the judge via dashboard or API. Triggers on "create a judge", "add a judge", "evaluate my LLM output", "set up automated evaluation", "judge template", or "scoring criteria".
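A hedged sketch of the API route for a binary judge follows; the endpoint path, payload fields, and auth header are assumptions, so treat the dashboard flow or the create-judge skill as the source of truth.

```python
# Hypothetical REST call that registers a binary judge. Endpoint, payload
# fields, and auth header are assumptions; use the create-judge skill or the
# dashboard for the real shapes.
import os
import requests

judge_template = """\
You are grading a support-bot reply.
Question: {{question}}
Reply: {{output}}

Answer PASS if the reply resolves the question without inventing policy,
otherwise answer FAIL. Respond with exactly PASS or FAIL.
"""

resp = requests.post(
    "https://api.zeroeval.com/judges",  # assumed endpoint
    headers={"Authorization": f"Bearer {os.environ['ZEROEVAL_API_KEY']}"},
    json={
        "name": "support-reply-pass-fail",
        "type": "binary",  # binary vs scored, per the skill description
        "template": judge_template,
    },
    timeout=30,
)
resp.raise_for_status()
```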
prompt-migration
This skill should be used when users want to migrate hardcoded prompts to ze.prompt for version tracking, feedback collection, judge linkage, and prompt optimization. It covers the full migration workflow for both Python and TypeScript. Triggers on "migrate prompt", "ze.prompt", "hardcoded prompt", "prompt migration", "send feedback", "prompt optimization", "wire feedback", or "connect judges to prompts".
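A before/after sketch of that migration; the exact ze.prompt signature (name/content keywords, how variables interpolate) is an assumption that the prompt-migration skill pins down.

```python
# Before/after sketch of moving a hardcoded prompt behind ze.prompt. The
# keyword names below are assumptions; the prompt-migration skill documents
# the real signature.
import zeroeval as ze

# Before: a plain string, invisible to version tracking and feedback.
HARDCODED_PROMPT = "Summarize the following ticket in one sentence:\n{ticket}"

# After: the same text registered via ze.prompt, so it is versioned, can
# collect feedback, and can be linked to judges.
summarize_prompt = ze.prompt(
    name="ticket-summarizer",
    content="Summarize the following ticket in one sentence:\n{ticket}",
)
```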
custom-tracing
This skill should be used when users want to send traces to ZeroEval without installing the SDK, using the REST API or OpenTelemetry (OTLP) directly. It covers direct HTTP span ingestion, OTLP collector configuration, and first-trace verification for any language. Triggers on "send traces via API", "direct API tracing", "custom tracing", "manual tracing", "without SDK", "unsupported language", "REST API tracing", "OTLP", "OpenTelemetry", or language cues like "Go", "Ruby", "Java", "Rust", "Elixir", or "PHP".
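For the OTLP route, a sketch using the standard OpenTelemetry Python packages (shown in Python for consistency with the other examples; the same exporter configuration exists for Go, Ruby, Java, and the rest). The ZeroEval endpoint URL and auth header are assumptions, so take the real values from the custom-tracing skill.

```python
# SDK-free tracing over OTLP with the stock OpenTelemetry packages. The
# ZeroEval endpoint URL and Authorization header are assumptions -- the
# custom-tracing skill has the real ingestion details.
import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://api.zeroeval.com/v1/traces",  # assumed OTLP ingestion URL
    headers={"Authorization": f"Bearer {os.environ['ZEROEVAL_API_KEY']}"},
)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-app")
with tracer.start_as_current_span("llm.call") as span:
    span.set_attribute("model", "gpt-4o")
    span.set_attribute("prompt.name", "ticket-summarizer")
```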