skill-creator

Pass

Audited by Gen Agent Trust Hub on Apr 7, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTIONDATA_EXFILTRATIONREMOTE_CODE_EXECUTIONNO_CODE
Full Analysis
  • [COMMAND_EXECUTION]: The scripts run_eval.mjs and improve_description.mjs use the node:child_process module to spawn the claude CLI tool. This is intended to automate the execution of test queries against the skill being developed or optimized.
  • [EXTERNAL_DOWNLOADS]: The evaluation viewer template (viewer.html) fetches the SheetJS library from a public CDN (cdn.sheetjs.com). This is a well-known service used here to facilitate the rendering of spreadsheet data in the local result viewer.
  • [INDIRECT_PROMPT_INJECTION]: The skill processes external data including skill definitions, evaluation queries, and execution transcripts. It uses these inputs to provide feedback and optimization suggestions. While this is an attack surface, the sub-agents (Grader, Analyzer, Comparator) are provided with specific roles and instructions to rely on evidence from transcripts, which helps mitigate the risk of accidental obedience to embedded instructions.
  • Ingestion points: Reads eval-set JSON files, skill markdown files, and execution transcripts via node:fs.
  • Boundary markers: Employs structured variables and template interpolation for agent prompts.
  • Capability inventory: Includes file system read/write via node:fs and execution of the claude CLI via spawn.
  • Sanitization: Relies on standard JSON parsing and file handling; sub-agents are explicitly instructed to cite evidence from the processed data.
  • [DYNAMIC_EXECUTION]: The run_eval.mjs script dynamically creates temporary markdown files in the project's .claude/commands directory and executes queries against them using the claude CLI. This mechanism is necessary for the skill's purpose of testing how other skills are triggered.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 7, 2026, 11:35 AM