skill-creator

Pass

Audited by Gen Agent Trust Hub on Apr 16, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes various system and platform commands via Python's subprocess module to facilitate its workflow.\n
  • Evidence: scripts/run_eval.py and scripts/improve_description.py call the claude CLI tool to run trigger evaluations and optimize skill descriptions.\n
  • Evidence: eval-viewer/generate_review.py uses lsof and kill to manage the local web server's port during initialization.\n
  • Evidence: SKILL.md instructs the agent to run nohup python to start the benchmark viewer server in the background.\n- [PROMPT_INJECTION]: The skill's optimization loop processes untrusted data which is then used to construct prompts for iterative skill improvement.\n
  • Ingestion points: scripts/improve_description.py reads the content of the skill being developed (SKILL.md) and the results of trigger evaluations as training data.\n
  • Boundary markers: The script uses XML-style tags (e.g., <skill_content>, <scores_summary>) to delimit external data within the prompt.\n
  • Capability inventory: The skill has access to shell command execution and file system writes.\n
  • Sanitization: No explicit sanitization or escaping of the ingested skill instructions was observed before interpolation into optimization prompts.\n- [EXTERNAL_DOWNLOADS]: The benchmark viewer UI loads a well-known JavaScript library from a public content delivery network.\n
  • Evidence: eval-viewer/viewer.html includes a script tag for SheetJS from cdn.sheetjs.com to enable spreadsheet rendering in the review interface.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 16, 2026, 02:50 PM