llm-obs-eval-pipeline

SKILL.md

Backend

Detection — At the start of every invocation, before taking any action, determine which backend to use:

  1. If the user passed --backend pup anywhere in their invocation → use pup mode immediately, regardless of whether MCP tools are present. Skip steps 2–6.
  2. Check whether MCP tools are present in your active tool list. The canonical signal is whether mcp__datadog-llmo-mcp__search_llmobs_spans appears in your available tools.
  3. If MCP tools are present → use MCP mode throughout. Call MCP tools exactly as named in the sub-skill workflow sections.
  4. If MCP tools are absent → check whether pup is executable: run pup --version via Bash. A JSON response containing "version" confirms pup is available.
  5. If pup responds → use pup mode throughout. Each sub-skill carries its own Tool Reference appendix with the full MCP→pup mapping.
  6. If neither is available → stop and tell the user:

    "Neither the Datadog MCP server nor the pup CLI is available. Connect the MCP server (claude mcp add --scope user --transport http datadog-llmo-mcp 'https://mcp.datadoghq.com/api/unstable/mcp-server/mcp?toolsets=llmobs') or install pup."

--backend pup is accepted anywhere in the invocation arguments. Strip it from args before passing to sub-skills, but carry the pup-mode decision forward — sub-skills must also operate in pup mode for the entire pipeline run.
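
The detection order and flag handling described above can be sketched as two small pure functions. This is only an illustrative sketch, not the skill's actual implementation: the names strip_backend_flag, pup_version_ok, and choose_backend are hypothetical, only the space-separated form "--backend pup" is handled, and the output of pup --version is passed in as a string (in a real run it would be captured via Bash only after MCP tools were found to be absent).

```python
import json
from typing import List, Optional, Sequence, Tuple

# Canonical MCP signal named in the detection steps.
MCP_SIGNAL_TOOL = "mcp__datadog-llmo-mcp__search_llmobs_spans"


def strip_backend_flag(args: Sequence[str]) -> Tuple[List[str], bool]:
    """Remove every adjacent `--backend pup` pair; report whether one was seen."""
    remaining: List[str] = []
    forced_pup = False
    i = 0
    while i < len(args):
        if args[i] == "--backend" and i + 1 < len(args) and args[i + 1] == "pup":
            forced_pup = True
            i += 2  # drop both tokens so sub-skills never see the flag
            continue
        remaining.append(args[i])
        i += 1
    return remaining, forced_pup


def pup_version_ok(stdout: Optional[str]) -> bool:
    """True if stdout parses as a JSON object containing a "version" key."""
    if stdout is None:
        return False
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and "version" in payload


def choose_backend(
    args: Sequence[str],
    available_tools: Sequence[str],
    pup_version_stdout: Optional[str],
) -> Tuple[Optional[str], List[str]]:
    """Return (backend, args_for_sub_skills); backend is "pup", "mcp", or None."""
    remaining, forced_pup = strip_backend_flag(args)
    if forced_pup:  # step 1: explicit flag wins, even if MCP tools are present
        return "pup", remaining
    if MCP_SIGNAL_TOOL in available_tools:  # steps 2-3
        return "mcp", remaining
    if pup_version_ok(pup_version_stdout):  # steps 4-5
        return "pup", remaining
    return None, remaining  # step 6: caller stops and shows the message


# Hypothetical invocation arguments, for illustration only.
backend, sub_skill_args = choose_backend(
    ["my-app", "--backend", "pup"], [MCP_SIGNAL_TOOL], None
)
```

A None backend corresponds to step 6: the caller stops and shows the "Neither the Datadog MCP server nor the pup CLI is available" message instead of continuing.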

Sub-skill backend propagation: The backend detected at pipeline startup applies to all three sub-skills (session-classify → trace-rca → eval-bootstrap). Do not re-detect per phase. Announce once at startup:

  • MCP mode: "(Running in MCP mode — all features available.)"
  • pup mode: "(Running in pup mode — pup commands used throughout. RUM signals use pup rum aggregate. Notebooks use pup notebooks create/edit. All features available.)"
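
The propagation rule above (detect once, announce once, reuse the same backend for all three phases) might look like the following sketch. The run_pipeline function and the run_phase callback are hypothetical stand-ins for however the sub-skills are actually invoked; the phase names and announcement strings are copied from the text above.

```python
from typing import Callable, List

# Phase order as given in the text: session-classify, trace-rca, eval-bootstrap.
SUB_SKILLS = ("session-classify", "trace-rca", "eval-bootstrap")

ANNOUNCEMENTS = {
    "mcp": "(Running in MCP mode — all features available.)",
    "pup": (
        "(Running in pup mode — pup commands used throughout. "
        "RUM signals use pup rum aggregate. "
        "Notebooks use pup notebooks create/edit. All features available.)"
    ),
}


def run_pipeline(
    backend: str,
    run_phase: Callable[[str, str], str],
    announce: Callable[[str], None] = print,
) -> List[str]:
    """Announce the backend once, then run every phase with that same backend."""
    if backend not in ANNOUNCEMENTS:
        raise ValueError(f"unknown backend: {backend!r}")
    announce(ANNOUNCEMENTS[backend])  # once at startup, never per phase
    # No re-detection here: the startup decision is passed to each sub-skill.
    return [run_phase(name, backend) for name in SUB_SKILLS]


# Illustration with a stub phase runner that just records what it was given.
messages: List[str] = []
results = run_pipeline("pup", lambda name, b: f"{name}:{b}", messages.append)
```

Passing the backend as an explicit argument, rather than letting each phase probe for tools, is what keeps a forced pup run from silently switching to MCP partway through.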
First Seen
May 18, 2026
llm-obs-eval-pipeline — datadog-labs/agent-skills