skill-creator
Pass
Audited by Gen Agent Trust Hub on Mar 24, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The skill includes several Python scripts (scripts/run_eval.py, scripts/improve_description.py, scripts/package_skill.py) that use subprocess.run and subprocess.Popen. These are used to interact with the system's official CLI tool (claude) to test skill triggering and to perform packaging operations. These actions are legitimate and necessary for the skill's purpose as an evaluation and development toolkit.
- [EXTERNAL_DOWNLOADS]: The human review interface (eval-viewer/viewer.html) references a well-known spreadsheet-processing library (SheetJS) hosted on a public Content Delivery Network (CDN). It is used for the legitimate purpose of rendering .xlsx files within the local viewer and does not constitute a security risk.
- [DATA_EXFILTRATION]: The eval-viewer/generate_review.py script starts a local web server (bound to 127.0.0.1) to display evaluation results. This local server is used strictly for human-in-the-loop review of the task outputs generated during testing and does not transmit data to external third-party servers.
- [REMOTE_CODE_EXECUTION]: The skill facilitates testing by spawning subagents to execute task prompts defined in evals/evals.json. This is the primary intended behavior for a benchmarking tool, and the execution is confined within the platform's standard subagent safety boundaries.
- [PROMPT_INJECTION]: The skill includes guidance on making skill descriptions "pushy" to improve triggering accuracy. This is a documented technique for steering LLM behavior and does not involve bypassing safety filters or overriding core ethical constraints.
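The COMMAND_EXECUTION finding above refers to the common pattern of driving a CLI tool via subprocess and capturing its output. A minimal sketch of that pattern follows; here the Python interpreter stands in for the claude CLI (which is not assumed to be installed), and the helper name and flags are illustrative assumptions, not taken from the audited scripts.

```python
import subprocess
import sys

def run_cli(args, timeout=60):
    """Run a command-line tool, capture its stdout as text, and raise on
    a non-zero exit code. This mirrors the subprocess.run usage pattern
    noted in the audit; the real scripts' arguments are not reproduced here."""
    result = subprocess.run(
        args,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str rather than bytes
        timeout=timeout,      # avoid hanging forever on a stuck process
        check=True,           # raise CalledProcessError on failure
    )
    return result.stdout

# sys.executable stands in for the `claude` binary for this sketch.
out = run_cli([sys.executable, "-c", "print('triggered')"])
print(out.strip())  # → triggered
```

Capturing output and enforcing a timeout is what lets an evaluation harness judge whether a prompt triggered the skill without the spawned process blocking the run.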
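The DATA_EXFILTRATION finding turns on the server binding to 127.0.0.1, which makes the review page reachable only from the local machine. The sketch below illustrates that loopback-only pattern with Python's standard library; the handler, payload, and port selection are hypothetical and are not taken from eval-viewer/generate_review.py.

```python
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.request import urlopen

class ReviewHandler(BaseHTTPRequestHandler):
    """Hypothetical handler serving a static review page."""

    def do_GET(self):
        body = b"<h1>Evaluation results</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for this sketch.
        pass

# Binding to 127.0.0.1 (not 0.0.0.0) restricts access to this machine.
# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), ReviewHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

page = urlopen(f"http://127.0.0.1:{server.server_port}/").read()
server.shutdown()
```

Because the socket is bound to the loopback interface, no remote host can connect, which is why a server like this supports local human review without constituting an exfiltration channel.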
Audit Metadata