skill-creator

Warn

Audited by Gen Agent Trust Hub on Apr 22, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill relies extensively on the subprocess module to execute external commands. Specifically, run_eval.py, improve_description.py, and run_loop.py use the claude CLI to test skill triggering and optimize instructions. Additionally, generate_review.py uses lsof to manage network ports.
  • [COMMAND_EXECUTION]: The eval-viewer/generate_review.py script starts a local HTTP server on 127.0.0.1 (port 3117) using Python's HTTPServer. This server is unauthenticated and serves all files within the workspace directory, including scripts and execution outputs, to allow the user to review agent performance.
  • [PROMPT_INJECTION]: The skill exposes an indirect prompt injection surface: it ingests untrusted data that is later re-processed by the agent or passed to the claude CLI.
      • Ingestion points: Test prompts come from evals/evals.json and from user-provided task descriptions in SKILL.md.
      • Boundary markers: When generating temporary skill files for testing, the scripts wrap descriptions in YAML block scalars (|), but they place no specific boundaries around user queries.
      • Capability inventory: The skill can execute shell commands via subprocess, write to the file system (e.g., to .claude/commands/, feedback.json, grading.json), and run a local network server.
      • Sanitization: There is no evidence that user-provided prompts are explicitly sanitized, filtered, or escaped before they are executed in the test environment.
  • [EXTERNAL_DOWNLOADS]: The results viewer (viewer.html) fetches well-known resources from external CDNs, including the SheetJS library from cdn.sheetjs.com and typography from Google Fonts. These are used for legitimate rendering and data visualization purposes.
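
The boundary-marker and sanitization gaps noted under PROMPT_INJECTION could be narrowed along the following lines. This is a minimal sketch, not code from the audited skill: the marker strings, the wrap_untrusted and build_command helpers, and the use of the claude CLI's -p flag are all assumptions made for illustration, since the audit does not state how the scripts invoke the CLI.

```python
import re

# Hypothetical marker names; the audited skill does not define these.
OPEN_MARK = "<untrusted_query>"
CLOSE_MARK = "</untrusted_query>"

# ASCII control characters other than tab and newline.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def wrap_untrusted(query: str) -> str:
    """Strip control characters and fence the query between markers."""
    cleaned = _CONTROL_CHARS.sub("", query)
    # Neutralise marker text embedded in the query so that it cannot
    # close the fence early.
    for mark in (OPEN_MARK, CLOSE_MARK):
        cleaned = cleaned.replace(mark, mark.replace("<", "&lt;"))
    return f"{OPEN_MARK}\n{cleaned}\n{CLOSE_MARK}"


def build_command(query: str) -> list[str]:
    """Build an argument list for a CLI call (assumed invocation)."""
    # A list passed to subprocess.run without shell=True is not
    # interpreted by a shell, so metacharacters in the query stay literal.
    # The -p flag is an assumption; the audit does not list the flags used.
    return ["claude", "-p", wrap_untrusted(query)]
```

Fencing the query does not make the downstream model immune to injected instructions; it only gives the prompt a clear boundary that the surrounding instructions can refer to. Passing the command as a list keeps the query out of shell parsing, which addresses the command-execution side separately.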
Audit Metadata
Risk Level: MEDIUM
Analyzed: Apr 22, 2026, 03:37 PM