custom-tracing
Custom Tracing (Direct API / OTLP)
Guide users through sending traces to ZeroEval without the Python or TypeScript SDK, using the REST API or the OpenTelemetry Protocol (OTLP).
When To Use
- The user's language or runtime has no ZeroEval SDK (Go, Ruby, Java, Rust, Elixir, PHP, etc.).
- The user wants to send spans over plain HTTP from any environment.
- The user already has OpenTelemetry instrumentation and wants to export to ZeroEval.
- The user prefers a vendor-neutral or SDK-free integration path.
- The user explicitly asks about POST /spans, the REST API, or OTLP ingestion (see the HTTP sketch below).
Do not use this skill when the user is working in Python or TypeScript and wants the full SDK experience (auto-instrumentation, ze.prompt, etc.). Use zeroeval-install instead.
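For the plain-HTTP path, a single request to POST /spans is all that is needed. Below is a minimal sketch in Go (one of the no-SDK languages listed above). The endpoint name comes from this skill; the host, the Bearer auth scheme, and every payload field name are assumptions — confirm the exact schema and headers in the ZeroEval API reference before relying on this.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"time"
)

func main() {
	// Hypothetical span payload -- field names are illustrative,
	// not the confirmed ZeroEval schema.
	span := map[string]any{
		"name":       "checkout",
		"trace_id":   "4bf92f3577b34da6a3ce929d0e0e4736", // 32 hex chars, W3C-style
		"span_id":    "00f067aa0ba902b7",                 // 16 hex chars
		"started_at": time.Now().Add(-250 * time.Millisecond).UTC().Format(time.RFC3339Nano),
		"ended_at":   time.Now().UTC().Format(time.RFC3339Nano),
	}
	body, _ := json.Marshal([]any{span}) // assumes /spans accepts a batch array

	// Assumed host and auth scheme -- check the ZeroEval API docs.
	req, _ := http.NewRequest("POST", "https://api.zeroeval.com/spans", bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+os.Getenv("ZEROEVAL_API_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

The same request works from curl or any other HTTP client; the only moving parts are the auth header and the span JSON.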
Prerequisites
- A ZeroEval account and API key from Settings -> API Keys.
- An HTTP client or OpenTelemetry exporter in the user's language of choice.
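For the OpenTelemetry path, point a standard OTLP exporter at ZeroEval and attach the API key as a header. The Go sketch below uses the standard OpenTelemetry SDK calls, which are real and stable; the ingestion host and the Bearer auth header are assumptions to confirm against the ZeroEval docs.

```go
package main

import (
	"context"
	"log"
	"os"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Assumed ingestion host and auth header -- confirm both with ZeroEval.
	exporter, err := otlptracehttp.New(ctx,
		otlptracehttp.WithEndpoint("api.zeroeval.com"), // host only; the exporter appends /v1/traces
		otlptracehttp.WithHeaders(map[string]string{
			"Authorization": "Bearer " + os.Getenv("ZEROEVAL_API_KEY"),
		}),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Batch and export spans through the standard SDK pipeline.
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	defer tp.Shutdown(ctx)
	otel.SetTracerProvider(tp)

	// Any existing OpenTelemetry instrumentation now exports to ZeroEval.
	tracer := otel.Tracer("example")
	_, span := tracer.Start(ctx, "hello-zeroeval")
	span.End()
}
```

Existing OpenTelemetry setups can often be redirected without code changes via the standard OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS environment variables.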