adversarial-review
Pass
Audited by Gen Agent Trust Hub on May 11, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: Indirect prompt injection vulnerability. The skill processes untrusted data such as code diffs and user messages and interpolates them directly into evaluation prompts for external AI models (Codex or Claude). This allows malicious instructions embedded within code to potentially manipulate the reviewer models' behavior.
- Ingestion points: Step 2 of SKILL.md reads content from recent diffs and user messages.
- Boundary markers: Absent. The prompt template in Step 3 lacks delimiters to separate the code being reviewed from the task instructions.
- Capability inventory: The skill utilizes shell commands to execute external AI CLIs.
- Sanitization: Absent. No escaping or validation is performed on the ingested code or diff content.
- [COMMAND_EXECUTION]: The skill executes shell commands to manage its review environment and invoke secondary tools.
- Evidence: Commands such as
mktemp -dfor temporary directory creation andcodex execorclaude -pfor model execution are used in Step 3.
Audit Metadata