creating-skills
Pass
Audited by Gen Agent Trust Hub on May 15, 2026
Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
- [COMMAND_EXECUTION]: Several utility scripts (e.g., `run_eval.py`, `improve_description.py`, and `generate_review.py`) utilize Python's `subprocess` module to invoke the `claude` CLI and system utilities like `lsof`. These operations are performed using structured list-based arguments rather than shell strings, which follows security best practices to prevent command injection.
- [COMMAND_EXECUTION]: In `scripts/run_eval.py` and `scripts/improve_description.py`, the skill programmatically modifies environment variables to remove the `CLAUDECODE` guard. This is a functional requirement to enable the nested subagent calls necessary for automated skill benchmarking and optimization loops.
- [EXTERNAL_DOWNLOADS]: The evaluation viewer (`eval-viewer/viewer.html`) loads the SheetJS library from a well-known CDN (`cdn.sheetjs.com`) to render spreadsheet outputs within the browser. This is documented neutrally as a functional dependency for visualizing results.
- [DATA_EXFILTRATION]: The `eval-viewer/generate_review.py` script initializes a local HTTP server bound to `127.0.0.1`. This is used exclusively to serve the qualitative review interface and save user feedback locally to `feedback.json` within the project workspace.
- [PROMPT_INJECTION]: The skill's primary purpose is to process and refine other agent skills. It ingests user-provided skill content and test queries to generate improved descriptions. While this involves processing untrusted data, the skill employs structured prompts and boundary markers (e.g., `<new_description>` tags) to maintain control over the optimization process.
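
The two COMMAND_EXECUTION findings describe a common pattern: pass arguments as a list and hand the child process a copy of the environment with the guard variable removed. The sketch below illustrates that pattern only; it is not taken from the audited scripts. The helper names `build_child_env` and `run_command` are hypothetical, and `sys.executable` stands in for the `claude` CLI so the example runs without it.

```python
import os
import subprocess
import sys


def build_child_env(guard="CLAUDECODE"):
    """Return a copy of the current environment without the guard variable."""
    env = os.environ.copy()
    env.pop(guard, None)  # no error if the variable is not set
    return env


def run_command(argv, timeout=30):
    """Run argv (a list, never a shell string) with the guard removed.

    Because no shell is involved, metacharacters inside an argument
    are passed through literally instead of being interpreted.
    """
    return subprocess.run(
        argv,
        env=build_child_env(),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )


if __name__ == "__main__":
    # Simulate running inside a guarded session (assumption for the demo).
    os.environ["CLAUDECODE"] = "1"
    probe = "import os; print('CLAUDECODE' in os.environ)"
    result = run_command([sys.executable, "-c", probe])
    print(result.stdout.strip())
```

Only the child's environment is changed; the parent process keeps its own `CLAUDECODE` value, which keeps the modification scoped to the nested call.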
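
One way boundary markers such as `<new_description>` can be consumed is to accept only the text between the tags and discard everything else in the model output. The following is a minimal sketch under that assumption; the closing-tag form `</new_description>`, the function name `extract_description`, and the regex approach are illustrative and not confirmed by the audit.

```python
import re

# Non-greedy match so only the first tag pair is captured;
# DOTALL lets the description span multiple lines.
TAG_PATTERN = re.compile(
    r"<new_description>(.*?)</new_description>", re.DOTALL
)


def extract_description(model_output):
    """Return the text inside the first tag pair, or None if it is absent."""
    match = TAG_PATTERN.search(model_output)
    if match is None:
        return None
    return match.group(1).strip()


if __name__ == "__main__":
    sample = (
        "Some reasoning that should be ignored.\n"
        "<new_description>\nFormats CSV files on request.\n</new_description>\n"
        "Trailing commentary."
    )
    print(extract_description(sample))
```

Returning `None` when the tags are missing lets the caller reject or retry a malformed response instead of treating arbitrary output as a description.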
Audit Metadata