llm-obs-experiment-py-bootstrap


LLM Obs Experiment (Python) Bootstrap — Generate a Python Experiment Using ddtrace.llmobs

Produce a single self-contained Python experiment that uses the official ddtrace.llmobs SDK. The output is either a .py script or an .ipynb notebook. The generated code mirrors the patterns in Datadog's reference notebooks at https://github.com/DataDog/llm-observability/tree/main/experiments/notebooks.

The SDK handles lazy project and experiment creation, dataset push diffing, the 5 MB / 1000-record bulk threshold, eval metric streaming, and the status state machine on the user's behalf. This skill must therefore never re-implement those primitives; it imports LLMObs and relies on it.
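To make the "import LLMObs and trust it" rule concrete, here is a minimal sketch of the kind of script this skill might generate. The task, the evaluator, and the project, dataset, and experiment names are hypothetical. The `LLMObs.enable`, `LLMObs.create_dataset`, `LLMObs.experiment`, and `experiment.run` calls follow the experiments interface as I understand it from Datadog's documentation, so treat the exact parameters as assumptions and check them against the installed ddtrace version.

```python
import os
from typing import Any, Dict, Optional


def answer_task(input_data: Dict[str, Any], config: Optional[Dict[str, Any]] = None) -> str:
    """Hypothetical task: a lookup table standing in for a real LLM call."""
    known = {"France": "Paris", "Japan": "Tokyo"}
    return known.get(input_data["country"], "unknown")


def exact_match(input_data: Dict[str, Any], output_data: str, expected_output: str) -> bool:
    """Function-style evaluator: True when the task output equals the expected output."""
    return output_data == expected_output


def main() -> None:
    # Imported lazily so the helpers above stay importable without ddtrace installed.
    from ddtrace.llmobs import LLMObs

    # Assumed parameters; the project is created lazily by the SDK.
    LLMObs.enable(
        api_key=os.environ["DD_API_KEY"],
        app_key=os.environ["DD_APPLICATION_KEY"],
        site=os.environ.get("DD_SITE", "datadoghq.com"),
        project_name="capitals-demo",  # hypothetical project name
    )

    # The SDK owns dataset pushing and diffing; no manual batching here.
    dataset = LLMObs.create_dataset(
        dataset_name="capitals-demo-dataset",  # hypothetical dataset name
        description="Tiny illustrative dataset",
        records=[
            {"input_data": {"country": "France"}, "expected_output": "Paris"},
            {"input_data": {"country": "Japan"}, "expected_output": "Tokyo"},
        ],
    )

    experiment = LLMObs.experiment(
        name="capitals-demo-experiment",  # hypothetical experiment name
        task=answer_task,
        dataset=dataset,
        evaluators=[exact_match],
        description="Illustrative run generated by the bootstrap skill",
    )

    # Metric streaming and status transitions are left to the SDK.
    experiment.run(jobs=2)


if __name__ == "__main__":
    if "DD_API_KEY" in os.environ and "DD_APPLICATION_KEY" in os.environ:
        main()
    else:
        print("Set DD_API_KEY and DD_APPLICATION_KEY to run the experiment.")
```

Note what the sketch leaves out: there are no HTTP calls, no record chunking, and no status polling. Everything beyond defining the task, the evaluator, and the dataset records is delegated to the SDK.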

Usage

/llm-obs-experiment-py-bootstrap [--format py|ipynb] [--dataset <path>] [--dataset-name <name>] [--dataset-version <int>] [--project-name <name>] [--evaluator-style function|class|remote] [--jobs <n>] [--output <path>]

Arguments: $ARGUMENTS

Inputs

All inputs are optional. If the user omits a flag, fall back to its default; never block on prompting for --jobs, --format, or any other flag.
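The "fall back to the default, never prompt" behavior can be sketched with argparse. The flag names and choice sets come from the Usage line above. The default values here are hypothetical placeholders, since this section does not list them, and `build_parser` and `parse_inputs` are illustrative helper names rather than part of the skill.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Every flag is optional and carries a default, so parsing never blocks."""
    parser = argparse.ArgumentParser(prog="llm-obs-experiment-py-bootstrap")
    # Defaults below are assumed placeholders, not the skill's documented values.
    parser.add_argument("--format", choices=["py", "ipynb"], default="py")
    parser.add_argument("--dataset", default=None)
    parser.add_argument("--dataset-name", default=None)
    parser.add_argument("--dataset-version", type=int, default=None)
    parser.add_argument("--project-name", default=None)
    parser.add_argument(
        "--evaluator-style",
        choices=["function", "class", "remote"],
        default="function",
    )
    parser.add_argument("--jobs", type=int, default=1)
    parser.add_argument("--output", default=None)
    return parser


def parse_inputs(argv: List[str]) -> argparse.Namespace:
    """Parse an explicit argument list; omitted flags resolve to their defaults."""
    return build_parser().parse_args(argv)
```

For example, `parse_inputs([])` succeeds with every value defaulted, and `parse_inputs(["--format", "ipynb", "--jobs", "4"])` overrides only those two flags.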

First Seen
May 20, 2026
llm-obs-experiment-py-bootstrap — datadog-labs/agent-skills