Codex Skill
Role
Run OpenAI Codex CLI as a delegated reasoning engine for code analysis, refactoring, and automated edits. Codex has its own sandbox model and its own context window — your job is to invoke it correctly, surface its results, and keep the user in control of any side-effecting operation.
Success looks like
- Codex runs under the right sandbox for what was asked: read-only for analysis, workspace-write only when edits were requested, never `danger-full-access` without explicit user opt-in.
- Model and reasoning effort match task complexity: flagship + `xhigh` for code review, flagship + `medium` for standard refactors, the fast model for cheap one-shots.
- The output you return to the user is Codex's output, not a paraphrase. Trust the tool.
- The user knows they can resume the session afterward.
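The sandbox rules above can be sketched as a small helper. This is a sketch, not part of the skill itself: the `--sandbox` values (`read-only`, `workspace-write`, `danger-full-access`) are assumed to match the current Codex CLI flag names, and the request-type labels are hypothetical — verify against `codex --help` before relying on it.

```shell
# Hypothetical mapping from the user's request type to a sandbox level.
# The --sandbox values assume the Codex CLI's current flag names.
sandbox_for() {
  case "$1" in
    analysis) echo "read-only" ;;           # reading and reasoning only
    edit)     echo "workspace-write" ;;     # user explicitly asked for edits
    full)     echo "danger-full-access" ;;  # only after explicit user opt-in
    *)        echo "read-only" ;;           # default to the safest level
  esac
}

# Example invocation (commented out; requires the codex CLI on PATH):
# codex exec --sandbox "$(sandbox_for analysis)" "review src/auth for race conditions"
```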
Model selection
Resolve the registry first, since model IDs change:
`Glob(pattern: "**/sdlc/**/config/model-registry.md", path: "~/.claude/plugins")` then `Read`
- Default to `codex-flagship`. Offer `codex-fast` for cost-sensitive or simple tasks.
- Ask the user for reasoning effort (`xhigh`/`high`/`medium`/`low`); the right level depends on task type, so don't pick silently.
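The defaults above can be expressed as a lookup of starting points to offer the user (not to pick silently). A minimal sketch: the task-type labels are hypothetical, and real model IDs should come from the resolved registry, not be hardcoded.

```shell
# Hypothetical task-type -> "model effort" suggestions, mirroring the
# defaults above. These are starting points to propose to the user;
# the skill still asks before committing to a reasoning effort.
codex_config_for() {
  case "$1" in
    code-review) echo "codex-flagship xhigh" ;;
    refactor)    echo "codex-flagship medium" ;;
    one-shot)    echo "codex-fast low" ;;
    *)           echo "codex-flagship medium" ;;  # conservative default
  esac
}

# Usage: split the pair into model and effort.
set -- $(codex_config_for code-review)
model=$1
effort=$2
```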