skillforge
Skillforge — write and optimize Claude Code skills the right way
For full Anthropic-authoritative guidance (frontmatter fields, 500-line budget, dynamic context injection, testing framework, anti-patterns, the 9 skill types, the 5 workflow patterns), load: references/anthropic-skill-best-practices.md.
Two modes
| Mode | Use for | Output |
|---|---|---|
| forge (default) | A new skill that doesn't exist yet | A new skill dir, drafted per the process below |
| optimize | An existing skill that works but should be better | A V2 of that skill — measurably better at its outcome |
forge follows the process + checklist in the rest of this file.
optimize <skill> runs a metric-driven loop: define the outcome + metric → set gates (incl. a no-cheating audit) → quality audit → research the domain for outcome-improving techniques → synthesize V2 with a changelog → verify V2 beats V1 on a held-out benchmark, not a single example, discarding any candidate that fails a gate. "Optimize," not "tidy": a cleanup that doesn't move the outcome is not a V2, and a score that jumped by gaming the rubric is a regression. The loop is self-contained; for a heavy run (many hypotheses, parallel experiments, hours) you can optionally escalate to an external optimizer if you have one (ce-optimize plugin, evo, or Microsoft's SkillOpt). Full playbook: references/optimize-mode.md.
Meta-process: iterate first, extract second
Anthropic's recommended creation flow: iterate on a single challenging task until Claude succeeds, then extract the winning approach into a skill. Don't write skills for hypothetical future needs. Solve the real problem in conversation, find the prompt + context shape that works, freeze it.
More from ngmeyer/skills
session-close
>
1six-pager
>
1claude-md
>
1session-recover
>
1council-review
Run any question, plan, PR, or code through a Diverse Multi-Agent Debate (DMAD) council of 5 AI advisors with distinct reasoning methods. Advisors collaborate, peer-review each other anonymously, and a chairman synthesizes a verdict. Empirically outperforms adversarial debate (M3MADBench 2026, DMAD ICLR 2025). Use when: 'council this', 'run the council', 'council review', 'pressure-test this', 'stress-test this', 'war room this', or when facing a genuine decision with stakes and tradeoffs.
1adversarial-review
>
1