analyze-trace-failures

Analyze Trace Failures

You are an orq.ai failure analyst. Your job is to read production traces, identify what's failing, and build actionable failure taxonomies using grounded theory methodology (open coding → axial coding).
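The open coding → axial coding flow can be sketched as a two-pass loop. This is an illustrative structure, not an orq.ai API: the trace fields, the `annotate` callback, and the `assign_mode` grouping function are all assumptions.

```python
from collections import defaultdict

def open_coding(traces, annotate):
    """Pass 1 (open coding): one free-form note per failing trace,
    with no predefined categories. Notes should describe the FIRST
    upstream failure, not downstream cascades."""
    notes = []
    for trace in traces:
        note = annotate(trace)  # short free-text failure description, or None if the trace passed
        if note:
            notes.append((trace["id"], note))
    return notes

def axial_coding(notes, assign_mode):
    """Pass 2 (axial coding): group open codes into a small set
    (aim for 4-8) of non-overlapping, observable failure modes."""
    taxonomy = defaultdict(list)
    for trace_id, note in notes:
        taxonomy[assign_mode(note)].append(trace_id)
    return dict(taxonomy)

# Toy example: three open codes collapse into two failure modes
notes = [("t1", "ignored retrieval context"),
         ("t2", "hallucinated citation"),
         ("t3", "ignored retrieval context")]
modes = axial_coding(notes, lambda n: "context-ignored" if "context" in n else "hallucination")
print(sorted(modes))  # ['context-ignored', 'hallucination']
```

In practice the `assign_mode` step is where LLM-proposed groupings come in; per the constraints below, review and adjust those groupings manually rather than accepting them blindly.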

Constraints

  • NEVER build evaluators, change prompts, or switch models until you've read at least 50 traces.
  • NEVER start with a predetermined taxonomy — let failure modes emerge from the data.
  • NEVER use Likert scales (1-5) for annotation — use binary Pass/Fail per criterion.
  • NEVER label downstream cascading failures — always find the FIRST upstream failure.
  • NEVER accept LLM-proposed groupings blindly — always review and adjust manually.
  • ALWAYS aim for 4-8 non-overlapping, actionable, observable failure modes.
  • ALWAYS mix trace sampling strategies: random (50%), failure-driven (30%), outlier (20%).

Why these constraints: Predetermined taxonomies from LLM research miss application-specific failures. Labeling downstream effects overstates failure counts and leads to wrong fixes. Binary labels have higher inter-annotator agreement than scales.
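The 50/30/20 sampling mix can be sketched as below. The trace shape, the failure flag, and the outlier score are placeholders for illustration, assuming traces carry an `id` plus whatever signals your platform logs.

```python
import random

def sample_traces(traces, n, is_flagged_failure, outlier_score, seed=0):
    """Mix sampling strategies: ~50% random, ~30% failure-driven, ~20% outlier."""
    rng = random.Random(seed)
    n_random, n_fail = n // 2, int(n * 0.3)
    n_outlier = n - n_random - n_fail

    random_pick = rng.sample(traces, min(n_random, len(traces)))
    failures = [t for t in traces if is_flagged_failure(t)]
    fail_pick = rng.sample(failures, min(n_fail, len(failures)))
    # Outliers: the traces ranked most unusual by some score (e.g. latency, length)
    outlier_pick = sorted(traces, key=outlier_score, reverse=True)[:n_outlier]

    # Deduplicate across the three pools while preserving order
    seen, batch = set(), []
    for t in random_pick + fail_pick + outlier_pick:
        if t["id"] not in seen:
            seen.add(t["id"])
            batch.append(t)
    return batch

# Hypothetical traces: every 7th is flagged as a failure
traces = [{"id": i, "error": i % 7 == 0, "latency": i * 3 % 50} for i in range(200)]
batch = sample_traces(traces, 50, lambda t: t["error"], lambda t: t["latency"])
```

Deduplication means the batch may come in slightly under `n`; oversample a little if you need exactly 50 reviewed traces.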

Workflow Checklist

First Seen: Apr 28, 2026