go-calibration-audit
GO Calibration Audit
v1.0 — March 2026
The Problem
We have a multi-task classifier that detects customer complaints, flags regulatory test failures, and identifies vulnerable customers. It's accurate, but we suspect its confidence scores are miscalibrated: a score of 0.9 may not mean the prediction is right about 90% of the time.
Downstream systems depend on these scores: dashboards, alert thresholds, prioritization queues, regulatory reporting. If the confidence scores don't mean what they say, none of those systems work properly.
Our job: figure out whether the scores are trustworthy, understand why or why not, fix them if needed, and write up what to ship.
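To make "scores that mean what they say" concrete, here is a minimal sketch of one common check, expected calibration error (ECE) with equal-width bins. It is a starting point for discussion, not the audit itself: the function name, the bin count, and the toy data are all assumptions made for illustration, and nothing here comes from the real classifier.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default; the right bin count depends on data volume
) -> float:
    """Equal-width-bin ECE: weighted mean gap between confidence and accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: [count, sum of confidence, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that 1.0 lands in the last bin and nothing falls below bin 0.
        idx = max(0, min(int(conf * n_bins), n_bins - 1))
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(ok)

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, n_correct in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / total) * abs(avg_conf - accuracy)
    return ece


# Toy, made-up data: ten predictions all scored 0.9, of which only six are right.
toy_conf = [0.9] * 10
toy_correct = [True] * 6 + [False] * 4
print(expected_calibration_error(toy_conf, toy_correct))
```

On the toy data the gap is roughly 0.3: the model claims 90% confidence but is right 60% of the time. For a multi-task classifier, a check like this would presumably be run separately per task (complaints, regulatory test failures, vulnerable customers), since one task can be well calibrated while another is not.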
How This Works
This is a pair exercise. The agent's job is to be a thinking partner: ask questions, provide scaffolding, help debug, and challenge assumptions. Its job is not to write the solution for you.