skill-creator
Pass
Audited by Gen Agent Trust Hub on Apr 1, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes several Python scripts (run_eval.py, eval_compare.py, optimize_description.py) that invoke external CLI tools via subprocess.run(). Specifically, it calls the claude CLI to run benchmarks and the go toolchain to validate generated code. These calls are essential for the skill's primary function of measuring and verifying skill performance. The scripts use list-based arguments rather than shell strings, which significantly mitigates the risk of shell injection.
- [DYNAMIC_EXECUTION]: The eval_compare.py script executes compiler and linter checks (go build, go test, go vet) on code produced during evaluation runs. While this involves running dynamically generated content, it is restricted to a local workspace and is the intended behavior for a software development evaluation tool.
- [INDIRECT_PROMPT_INJECTION]: The skill's core workflow involves ingesting untrusted 'test prompts' or 'eval queries' and passing them to an LLM via the claude CLI. This creates a surface for indirect prompt injection; however, the skill is explicitly designed for testing and measurement, and its instructions include safety-conscious patterns such as warning against hardcoded secrets and encouraging the use of gates to verify outcomes.
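The list-based argument pattern noted in the COMMAND_EXECUTION finding can be illustrated with a minimal sketch. This is not the skill's actual code: the helper name run_tool, the timeout value, and the sample input are hypothetical, and the current Python interpreter stands in for the external CLI so the example runs anywhere. It shows that when subprocess.run() receives a list, shell metacharacters in an untrusted value reach the child process as literal text.

```python
import subprocess
import sys


def run_tool(argv, cwd=None, timeout=60):
    """Run an external tool with list-based arguments.

    Hypothetical helper for illustration. Because argv is a list and
    shell=True is not set, no shell parses the arguments.
    """
    return subprocess.run(
        argv,
        cwd=cwd,              # optionally confine the call to a workspace directory
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout,      # assumed limit; avoids hanging on a stuck tool
        check=False,          # caller inspects returncode itself
    )


# Hypothetical untrusted input containing a shell metacharacter.
untrusted = "hello; echo INJECTED"

# The child simply prints its first argument back.
result = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
)
echoed = result.stdout.strip()
print(echoed)
```

The child receives the whole string as a single argument, so the "; echo INJECTED" part is never interpreted as a second command. Passing the same value inside a shell string with shell=True would hand it to the shell for parsing, which is the risk the audit says the scripts avoid.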
Audit Metadata