skill-creator
Skill Creator
A skill for creating new skills and iteratively improving them.
At a high level, the process of creating a skill goes like this:
- Decide what you want the skill to do and roughly how it should do it
- Write a draft of the skill
- Create a few test prompts and run claude-with-access-to-the-skill on them
- Help the user evaluate the results both qualitatively and quantitatively
- While the runs happen in the background, draft some quantitative evals if there aren't any (if there are some, you can either use as is or modify if you feel something needs to change about them). Then explain them to the user (or if they already existed, explain the ones that already exist)
- Use the
eval-viewer/generate_review.pyscript to show the user the results for them to look at, and also let them look at the quantitative metrics
- Rewrite the skill based on feedback from the user's evaluation of the results (and also if there are any glaring flaws that become apparent from the quantitative benchmarks)
- Repeat until you're satisfied
- Expand the test set and try again at larger scale
Your job when using this skill is to figure out where the user is in this process and then jump in and help them progress through these stages. So for instance, maybe they're like "I want to make a skill for X". You can help narrow down what they mean, write a draft, write the test cases, figure out how they want to evaluate, run all the prompts, and repeat.
On the other hand, maybe they already have a draft of the skill. In this case you can go straight to the eval/iterate part of the loop.
More from sjunepark/custom-skills
summarize
Use the steipete/summarize CLI to summarize URLs, local files, stdin, YouTube links, podcasts, and media with LLM models. Trigger when users ask to install or run summarize, configure model/provider API keys, tune output flags (length/language/json/extract/slides), set defaults in ~/.summarize/config.json, or troubleshoot summarize CLI errors.
42skills-cli
Operate the skills CLI to discover, install, list, update, remove, and initialize skills for Codex, Claude Code, and Pi. Use when users ask to manage skills from skills.sh, restore from lock files, sync skills from node_modules, or troubleshoot agent/installation scope (project vs global).
37post-implementation-review
Manually review already-implemented code for design flaws, abstraction issues, structural problems, or refactors that only became clear in real code. Use only when the user explicitly asks for a post-implementation review, explicitly asks whether recent implementation work revealed design or structure problems, or explicitly wants refactor recommendations after the code exists. Do not auto-trigger for ordinary implementation, debugging, explanation, or generic code review requests. Prefer embedded snippets with file-path comments over editor-oriented file and line references. Treat findings as signals about code shape and quality; prioritize root-cause design, ownership, abstraction, and organization improvements, including broad refactors when warranted, over bandage fixes such as tiny helper extractions or local polish.
30architecture-md-writer
Create, update, review, and split ARCHITECTURE.md files that explain a codebase's shape, major components, runtime flow, code map, and important invariants. Use when a repository lacks architecture docs, an existing ARCHITECTURE.md is stale or too detailed, a subsystem needs its own nested ARCHITECTURE.md, or a root architecture doc should link to deeper module architecture docs.
27agents-md-writer
Create, edit, review, and improve AGENTS.md files for repositories used by agentic coding tools with concise, actionable instructions and correct precedence behavior. Use whenever AGENTS.md content is being changed, including updating existing guidance, drafting a new AGENTS.md, migrating legacy instruction files, defining nested overrides in monorepos, or debugging why tools load the wrong guidance.
26source-investigator
Investigate external libraries, frameworks, and unfamiliar repositories by cloning the exact repo into a project-local temp workspace, ignoring that workspace in git, and delegating code reading to focused subagents so the main thread stays clean. Use whenever docs are incomplete, version-specific behavior matters, you need to learn how a codebase works, or exploring lots of source inline would pollute the main context.
24