debug
Debug: evidence-before-action investigation
The most expensive class of mistake in ML debugging is asserting a cause based on plausibility, then attempting a "fix" that masks the real problem. This skill enforces the discipline of probe → hypothesis → smoke → controls → claim, in that order.
The agentic Stop hook routes here from reason when an assistant claims a cause without backing tool output.
When to run
The user just said any of:
- "why is X failing / diverging / NaN / OOM / hung / slow / crashed"
- "the loss is going up", "metrics look weird", "GPU util is 0"
- "debug this", "diagnose", "troubleshoot", "investigate this run"
- pasted a log excerpt asking what's wrong
Five-step protocol
Step 1: cheap probes
More from fcakyon/phd-skills
paper-writing
>
8reviewer-defense
>
7dataset-curation
>
7reproduce
End-to-end paper reproduction from arxiv URL through smoke runs to replication experiments. Handles missing or partial official code, missing training scripts, missing hyperparameters, and private datasets via similar-public-dataset substitution. Use when the user asks to reproduce, implement, replicate, or re-run a paper from scratch, or pastes an arxiv URL with reproduction intent.
7experiment-design
>
7paper-verification
>
7