skill-creator
Pass
Audited by Gen Agent Trust Hub on Apr 23, 2026
Risk Level: SAFE (COMMAND_EXECUTION, EXTERNAL_DOWNLOADS)
Full Analysis
- [COMMAND_EXECUTION]: Several Python scripts utilize the subprocess module to execute system commands. Specifically, scripts/run_eval.py invokes the claude CLI to run evaluation queries, and eval-viewer/generate_review.py uses lsof and os.kill to manage the lifecycle of the local web server. These are standard operations for a developer tool designed to automate CLI workflows.
- [EXTERNAL_DOWNLOADS]: The evaluation viewer component (eval-viewer/viewer.html) loads the SheetJS library from a well-known CDN (cdn.sheetjs.com) and utilizes Google Fonts. These assets are used to render complex data formats like Excel spreadsheets and to provide a consistent UI for the local reporting dashboard.
- [INDIRECT_PROMPT_INJECTION]: The skill represents a surface for indirect prompt injection as it ingests and processes untrusted user data (test prompts) and their resulting outputs during the evaluation phase.
  - Ingestion points: Evaluation prompts are read from evals/evals.json and external files provided by the user.
  - Boundary markers: The skill architecture leverages subagents (via the Agent tool) to isolate the execution context of individual test runs from the main orchestrator.
  - Capability inventory: The skill can read/write files, execute shell commands via included scripts, and perform network requests via the Anthropic API.
  - Sanitization: The skill relies on the underlying LLM's reasoning and the structure of the claude CLI to handle potentially malicious instructions in test data.
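The server lifecycle handling noted under COMMAND_EXECUTION (lsof plus os.kill) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in eval-viewer/generate_review.py: the function names parse_pids, pids_listening_on, and stop_server are hypothetical, as is the choice of SIGTERM and of the lsof flags shown. It assumes lsof is installed on the host.

```python
import os
import signal
import subprocess


def parse_pids(lsof_output: str) -> list[int]:
    """Turn the one-PID-per-line output of `lsof -t` into a list of ints."""
    return [int(token) for token in lsof_output.split() if token.isdigit()]


def pids_listening_on(port: int) -> list[int]:
    """Return PIDs holding a TCP socket in LISTEN state on `port`."""
    # Hypothetical helper. Arguments are passed as a list, so no shell
    # is involved and the port value cannot be interpreted as shell syntax.
    result = subprocess.run(
        ["lsof", "-t", f"-iTCP:{port}", "-sTCP:LISTEN"],
        capture_output=True,
        text=True,
        check=False,  # lsof exits non-zero when nothing matches
    )
    return parse_pids(result.stdout)


def stop_server(port: int) -> list[int]:
    """Send SIGTERM to each listener on `port`; return the PIDs signalled."""
    stopped = []
    for pid in pids_listening_on(port):
        try:
            os.kill(pid, signal.SIGTERM)
            stopped.append(pid)
        except ProcessLookupError:
            # The process exited between the lsof call and the kill.
            pass
    return stopped
```

Restricting the lookup to sockets in the LISTEN state keeps the sketch from signalling unrelated client processes that merely have a connection open on the same port.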
Audit Metadata