rules-from-coding-agent-failures
Evidence-Driven Agent Rules
Capture observed agent failures, promote the recurring ones into rules, and score
advanced (Level 4–5) agent-readiness. For repos with a feedback signal: eval suites,
run-level telemetry, A/B baselines, or a skill catalog under test. No signal yet — no
benchmark, no telemetry baseline, no "we watched the agent do X" stream? Use
harden-repo-for-coding-agents alone; promotion here needs evidence to drive it.
Boundaries
Do NOT use for first-pass agent-readiness scaffolding when no feedback signal exists yet (use harden-repo-for-coding-agents), or to design or operate an AI product's eval and optimization loops (use agent-test / agent-ops).
Produces: capture writes the reflection-log scaffold (docs/reflection-log/README.md,
_template.md, and a repo-root README.md §Agents pointer); promote proposes the
smallest rule / hook / CI-gate diff plus the entries it cites; assess-l4l5 scores
Level 4–5 maturity with a ceiling and prioritized gaps. Tracked assess-l4l5 runs also
emit rules-from-coding-agent-failures-findings-ledger-<date>-<slug>.md and rules-from-coding-agent-failures-workflow-state-<date>-<slug>.json.