evaluate-plugin-batch
Fail
Audited by Gen Agent Trust Hub on Apr 21, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: Command injection vulnerability in dynamic context placeholders within `SKILL.md`. The skill uses the !`command` syntax to execute shell commands when the skill is loaded into the agent's context. These commands include the `$1` variable, which represents the user-supplied `<plugin-name>` argument. Because this input is not sanitized, a user could provide shell metacharacters (e.g., `;`, `&`, `|`) to execute unauthorized commands on the host system. Evidence: `!find $1/skills -name "SKILL.md" -maxdepth 3` and `!find $1/skills -name "evals.json" -maxdepth 3`.
- [COMMAND_EXECUTION]: The execution step involves running a local shell script, `evaluate-plugin/scripts/aggregate_benchmark.sh`, with the user-controlled `<plugin-name>` argument. This pattern facilitates command injection if the underlying script performs unsafe evaluation or interpolation of its parameters.
- [PROMPT_INJECTION]: Indirect prompt injection attack surface. The skill reads external data from `SKILL.md` and `evals.json` files within the target plugin.
  - Ingestion points: `SKILL.md`, `evals.json` in `Step 1` and `Step 2`.
  - Boundary markers: Absent. Instructions do not define delimiters for external data.
  - Capability inventory: `Bash`, `SlashCommand`, `Write`, `Read`, `Task`, `TodoWrite`.
  - Sanitization: Absent. The skill does not validate the content of the ingested files before using them to drive next steps.
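
To illustrate the command execution findings, here is a minimal sketch of allowlist validation for the `<plugin-name>` argument. It is not taken from the audited skill. The function names `validate_plugin_name` and `list_skill_files` are hypothetical, and the sketch assumes the check runs inside a wrapper script that receives the name as a normal positional argument, not inline in a `SKILL.md` placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not part of the audited skill): accept only a
# conservative allowlist of characters in the plugin name.
validate_plugin_name() {
    case "$1" in
        # Reject: empty string, leading '-' (could be read as an option),
        # or any character outside letters, digits, '.', '_' and '-'.
        ''|-*|*[!A-Za-z0-9._-]*) return 1 ;;
        # Reject '..' sequences to avoid path traversal.
        *..*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper around the find command cited as evidence.
list_skill_files() {
    if ! validate_plugin_name "$1"; then
        echo "invalid plugin name" >&2
        return 2
    fi
    # Quote the expansion so the name is passed to find as one argument.
    find "./$1/skills" -maxdepth 3 -name "SKILL.md"
}
```

With this check in place, a name such as `my-plugin` would pass, while `foo; rm -rf /` or `a|b` would be rejected before any command runs. An allowlist is used here instead of escaping because it is easier to reason about which inputs are accepted.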
Recommendations
- AI detected serious security threats
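
For the prompt injection finding, one possible mitigation is to wrap ingested file content in explicit boundary markers before it enters the agent's context. The sketch below is an assumption-laden illustration, not the audited skill's behavior: the function name `wrap_untrusted` and the marker format are hypothetical. A random identifier is included so that file content cannot trivially forge the closing marker.

```shell
#!/bin/sh
# Hypothetical helper: print a file between delimiters that mark it as
# untrusted external data.
#   $1: label for the data (for example the file name)
#   $2: path to the file to wrap
wrap_untrusted() {
    # 8 random bytes as hex; used to tie the opening and closing markers together.
    nonce=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
    printf '<<<UNTRUSTED_DATA %s id=%s>>>\n' "$1" "$nonce"
    cat -- "$2"
    printf '\n<<<END_UNTRUSTED_DATA id=%s>>>\n' "$nonce"
}
```

Boundary markers on their own do not guarantee that embedded instructions are ignored. They are typically paired with instructions telling the agent to treat the delimited content as data only, and with validation of the ingested files.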
Audit Metadata