The Agent Skills Directory

[COMMAND_EXECUTION]: The skill implements a command-check verification method that instructs the agent to run arbitrary shell commands via Bash based on the content of evaluation cases. This provides a direct mechanism for executing potentially unsafe system commands.
[REMOTE_CODE_EXECUTION]: The skill supports executing custom mcp-scripts via script-check and invokes a specific Python script (quick_validate.py) from a relative path on the filesystem. This allows the execution of code logic defined outside the skill's own instructions.
[PROMPT_INJECTION]: The skill has a significant indirect prompt injection surface as it ingests 'promoted learnings' and 'rules' from upstream processes. These inputs are used to define the verification logic and commands executed by the agent.
Ingestion points: Evaluation cases are created from learnings gathered by harness-updater or learning-aggregator.
Boundary markers: There are no delimiters or safety instructions provided to prevent malicious content within the 'learnings' from being interpreted as commands.
Capability inventory: The skill can execute shell commands (command-check), run scripts (script-check), and read/write project files.
Sanitization: No sanitization or validation of the ingested rules or command strings is performed before they are integrated into the testing harness.
[DATA_EXFILTRATION]: The skill is designed to read and verify the contents of sensitive project files such as CLAUDE.md and AGENTS.md. This access, coupled with the command execution capability, could be leveraged to exfiltrate project data if the verification logic is compromised.

eval-creator