creating-skills

Pass

Audited by Gen Agent Trust Hub on May 15, 2026

Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: Several utility scripts (e.g., run_eval.py, improve_description.py, and generate_review.py) use Python's subprocess module to invoke the claude CLI and system utilities such as lsof. These calls pass structured, list-based arguments rather than shell strings, which follows security best practice for preventing command injection.
  • [COMMAND_EXECUTION]: In scripts/run_eval.py and scripts/improve_description.py, the skill programmatically modifies environment variables to remove the CLAUDECODE guard. This is a functional requirement to enable the nested subagent calls necessary for automated skill benchmarking and optimization loops.
  • [EXTERNAL_DOWNLOADS]: The evaluation viewer (eval-viewer/viewer.html) loads the SheetJS library from a well-known CDN (cdn.sheetjs.com) to render spreadsheet outputs within the browser. This is noted as a functional dependency for visualizing results rather than as a risk.
  • [DATA_EXFILTRATION]: The eval-viewer/generate_review.py script initializes a local HTTP server bound to 127.0.0.1. This is used exclusively to serve the qualitative review interface and save user feedback locally to feedback.json within the project workspace.
  • [PROMPT_INJECTION]: The skill's primary purpose is to process and refine other agent skills. It ingests user-provided skill content and test queries to generate improved descriptions. While this involves processing untrusted data, the skill employs structured prompts and boundary markers (e.g., <new_description> tags) to maintain control over the optimization process.
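The patterns described in the findings above can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's actual code: the helper names (child_env, run_argv, extract_description) are hypothetical, and the Python interpreter stands in for the claude CLI so that no CLI flags are assumed. It shows list-based subprocess arguments, a child environment with the CLAUDECODE variable removed, and extraction of text between <new_description> boundary markers.

```python
import os
import re
import subprocess
import sys


def child_env(guard="CLAUDECODE"):
    """Return a copy of the current environment without the guard variable.

    The parent process's own environment is left untouched.
    """
    env = os.environ.copy()
    env.pop(guard, None)
    return env


def run_argv(argv, env):
    """Run a command given as a list of arguments; no shell parses the input."""
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


def extract_description(text):
    """Return the text inside the first <new_description> block, or None."""
    match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    # Hypothetical hostile input: with list-based arguments it is passed
    # through as one literal argument instead of being split on ';'.
    hostile = "hello; echo INJECTED"
    # Stand-in command (assumption): the Python interpreter, not the claude CLI.
    result = run_argv(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
        child_env(),
    )
    print(result.stdout.strip())  # prints: hello; echo INJECTED
```

Because the argument list is handed to the child process directly, shell metacharacters in untrusted input carry no special meaning, and only the content inside the boundary markers is taken from the model's reply.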
Audit Metadata
Risk Level: SAFE
Analyzed: May 15, 2026, 03:16 PM