The Agent Skills Directory

[COMMAND_EXECUTION]: Several Python scripts use the subprocess module to execute system commands. Specifically, scripts/run_eval.py invokes the claude CLI to run evaluation queries, and eval-viewer/generate_review.py uses lsof and os.kill to manage the lifecycle of the local web server. These are standard operations for a developer tool designed to automate CLI workflows.
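For illustration, the kind of subprocess call described above can be sketched as follows. This is a minimal sketch, not the code in scripts/run_eval.py: the helper name run_command and the timeout value are hypothetical, and the example runs the Python interpreter rather than the claude CLI, whose exact arguments are not shown in this report. It passes arguments as a list, so no shell is involved.

```python
import subprocess
import sys
from typing import Sequence


def run_command(args: Sequence[str], timeout_s: float = 60.0) -> str:
    """Run a command without a shell and return its stdout as text.

    Hypothetical helper: passing a list (not a string) means the arguments
    are never interpreted by a shell.
    """
    result = subprocess.run(
        list(args),
        capture_output=True,  # collect stdout and stderr instead of inheriting
        text=True,            # decode bytes to str
        timeout=timeout_s,    # raise TimeoutExpired if the command hangs
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in command; a real evaluation script would invoke its own CLI here.
    print(run_command([sys.executable, "-c", "print('ok')"]).strip())
```

Using an argument list with check=True and a timeout keeps failures visible and avoids shell interpretation of any text that originates from test data.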
[EXTERNAL_DOWNLOADS]: The evaluation viewer component (eval-viewer/viewer.html) loads the SheetJS library from a well-known CDN (cdn.sheetjs.com) and uses Google Fonts. These assets are used to render complex data formats such as Excel spreadsheets and to provide a consistent UI for the local reporting dashboard.
[INDIRECT_PROMPT_INJECTION]: The skill presents an attack surface for indirect prompt injection because it ingests and processes untrusted user data (test prompts) and the resulting outputs during the evaluation phase.
Ingestion points: Evaluation prompts are read from evals/evals.json and external files provided by the user.
Boundary markers: The skill architecture uses subagents (via the Agent tool) to isolate the execution context of individual test runs from the main orchestrator.
Capability inventory: The skill can read/write files, execute shell commands via included scripts, and perform network requests via the Anthropic API.
Sanitization: The skill relies on the underlying LLM's reasoning and the structure of the claude CLI to handle potentially malicious instructions in test data.
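
The points above note that test prompts are ingested as untrusted data and that no explicit sanitization step is applied. One possible mitigation, sketched here under stated assumptions, is to wrap each prompt in explicit delimiters before it is embedded in a larger instruction. This is not what the skill does: the marker strings, the function names, and the assumed layout of the JSON (a list of objects with a "prompt" key) are all hypothetical and chosen only for illustration.

```python
import json
from typing import List

# Hypothetical delimiter strings; any unlikely-to-occur markers would do.
BEGIN = "<<<UNTRUSTED_TEST_DATA>>>"
END = "<<<END_UNTRUSTED_TEST_DATA>>>"


def wrap_untrusted(text: str) -> str:
    """Wrap untrusted text in delimiters, neutralizing embedded markers."""
    # Remove any copies of the markers so the data cannot close the block early.
    cleaned = text.replace(BEGIN, "[marker removed]").replace(END, "[marker removed]")
    return f"{BEGIN}\n{cleaned}\n{END}"


def load_prompts(raw_json: str) -> List[str]:
    """Parse a JSON list of objects and wrap each string "prompt" value.

    The schema is an assumption; entries without a string "prompt" are skipped.
    """
    data = json.loads(raw_json)
    prompts = []
    for item in data:
        prompt = item.get("prompt") if isinstance(item, dict) else None
        if isinstance(prompt, str):
            prompts.append(wrap_untrusted(prompt))
    return prompts


if __name__ == "__main__":
    sample = '[{"prompt": "Summarize this file."}, {"note": "no prompt here"}]'
    for wrapped in load_prompts(sample):
        print(wrapped)
```

Delimiters of this kind only make the boundary of untrusted data explicit; they do not by themselves prevent a model from following instructions inside it, so they complement rather than replace the subagent isolation described above.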

skill-creator