error-analysis
Error Analysis
Guide the user through reading LLM pipeline traces and building a catalog of how the system fails.
Overview
- Collect ~100 representative traces
- Read each trace, judge pass/fail, and note what went wrong
- Group similar failures into categories
- Label every trace against those categories
- Compute failure rates to prioritize what to fix
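The last two steps above (labeling, then computing failure rates) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `failure_rates` function, the `labels` mapping, and the category names in the usage note are all hypothetical, and it assumes a trace can carry more than one failure category.

```python
from collections import Counter


def failure_rates(labels: dict[str, set[str]]) -> dict[str, float]:
    """Return the fraction of traces exhibiting each failure category.

    `labels` maps a trace id to the set of failure categories assigned to
    that trace (hypothetical structure; an empty set means the trace passed).
    The result is ordered from most to least frequent category.
    """
    total = len(labels)
    if total == 0:
        return {}
    # Count each category once per trace it appears in.
    counts = Counter(cat for cats in labels.values() for cat in cats)
    # most_common() yields categories sorted by count, highest first.
    return {cat: n / total for cat, n in counts.most_common()}
```

For example, calling `failure_rates({"t1": {"wrong_tool"}, "t2": {"wrong_tool", "hallucinated_fact"}, "t3": set(), "t4": set()})` ranks `wrong_tool` (half of the traces) ahead of `hallucinated_fact` (a quarter), which is the ordering used to decide what to fix first. Because a trace can carry several categories, the rates need not sum to 1.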
Core Process
Step 1: Collect Traces
Capture the full trace: input, all intermediate LLM calls, tool uses, retrieved documents, reasoning steps, and final output.
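One way to keep all of these pieces together is a single record per trace. The sketch below is an assumption about how such a record could look, not a required schema: the `Trace` class, its field names, and the `to_jsonl_line` helper are hypothetical, and the shapes of the nested entries (plain dicts and strings) are placeholders for whatever your pipeline actually logs.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Trace:
    """One end-to-end run of the pipeline (hypothetical schema)."""

    trace_id: str
    input: str
    final_output: str
    # Intermediate steps default to empty lists so that simple,
    # single-call pipelines can still be recorded.
    llm_calls: list[dict] = field(default_factory=list)
    tool_uses: list[dict] = field(default_factory=list)
    retrieved_documents: list[str] = field(default_factory=list)
    reasoning_steps: list[str] = field(default_factory=list)


def to_jsonl_line(trace: Trace) -> str:
    """Serialize a trace as one JSON line so traces can be appended to a file."""
    return json.dumps(asdict(trace), ensure_ascii=False)
```

Storing one JSON object per line keeps each trace self-contained, which makes it straightforward to sample, read, and annotate traces one at a time.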
Target: ~100 traces. This is roughly the point at which additional traces stop revealing new kinds of failures; the exact number depends on system complexity.
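A rough way to check whether you have reached that point is to count how many traces you have read since the last one that revealed a new kind of failure. The helper below is a minimal sketch under that assumption; `traces_since_last_new_category` and its input format are hypothetical, and what counts as "long enough without anything new" is left to your judgment.

```python
def traces_since_last_new_category(categories_per_trace: list[set[str]]) -> int:
    """Count traces reviewed since a previously unseen failure category appeared.

    `categories_per_trace` lists, in review order, the set of failure
    categories noted for each trace (an empty set for a passing trace).
    """
    seen: set[str] = set()
    since_new = 0
    for cats in categories_per_trace:
        if cats - seen:
            # At least one category here has not been seen before.
            seen |= cats
            since_new = 0
        else:
            since_new += 1
    return since_new
```

If this count keeps growing as you read more traces, new traces are mostly repeating known failure modes, which suggests the sample is large enough for this system.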