codex-readiness-unit-test
LLM Codex Readiness Unit Test
Instruction-first, in-session "readiness" for evaluating AGENTS/PLANS documentation quality without any external APIs or SDKs. All checks run against the current working directory (cwd), with no monorepo discovery. Each run writes to .codex-readiness-unit-test/<timestamp>/ and updates .codex-readiness-unit-test/latest.json. Keep execution deterministic (filesystem scanning + local command execution only). All LLM evaluation happens in-session and must output strict JSON via the provided references.
Quick Start
- Collect evidence:
python skills/codex-readiness-unit-test/bin/collect_evidence.py
- Run deterministic checks:
python skills/codex-readiness-unit-test/bin/deterministic_rules.py
- Run LLM checks using references in
references/and store.codex-readiness-unit-test/<timestamp>/llm_results.json. - If execute mode is requested, build a plan, get confirmation, run:
python skills/codex-readiness-unit-test/bin/run_plan.py --plan .codex-readiness-unit-test/<timestamp>/plan.json
- Generate the report:
python skills/codex-readiness-unit-test/bin/scoring.py --mode read-only|execute
Outputs (per run, under .codex-readiness-unit-test/<timestamp>/):
report.jsonreport.html
More from openai/skills
screenshot
Use when the user explicitly asks for a desktop or system screenshot (full screen, specific app or window, or a pixel region), or when tool-specific capture capabilities are unavailable and an OS-level capture is needed.
2.7Ksecurity-best-practices
Perform language and framework specific security best-practice reviews and suggest improvements. Trigger only when the user explicitly requests security best practices guidance, a security review/report, or secure-by-default coding help. Trigger only for supported languages (python, javascript/typescript, go). Do not trigger for general code review, debugging, or non-security tasks.
2.5Kfigma
Use the Figma MCP server to fetch design context, screenshots, variables, and assets from Figma, and to translate Figma nodes into production code. Trigger when a task involves Figma URLs, node IDs, design-to-code implementation, or Figma MCP setup and troubleshooting.
2.4Kplaywright
Use when the task requires automating a real browser from the terminal (navigation, form filling, snapshots, screenshots, data extraction, UI-flow debugging) via `playwright-cli` or the bundled wrapper script.
2.4Kpdf
Use when tasks involve reading, creating, or reviewing PDF files where rendering and layout matter; prefer visual checks by rendering pages (Poppler) and use Python tools such as `reportlab`, `pdfplumber`, and `pypdf` for generation and extraction.
2.3Kfigma-implement-design
Translates Figma designs into production-ready application code with 1:1 visual fidelity. Use when implementing UI code from Figma files, when user mentions "implement design", "generate code", "implement component", provides Figma URLs, or asks to build components matching Figma specs. For Figma canvas writes via `use_figma`, use `figma-use`.
2.2K