adversarial-reviewer

Pass

Audited by Gen Agent Trust Hub on Mar 27, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill processes untrusted external data (code diffs and project files) and incorporates them into the agent's execution context. It lacks defensive instructions or delimiters to mitigate risks where malicious instructions embedded in a diff could attempt to override the agent's behavior.
  • Ingestion points: Code diff content and file contents retrieved via Read, Grep, and Glob tools.
  • Boundary markers: The instructions do not specify any markers or "ignore embedded instructions" warnings to isolate untrusted code from the agent's core instructions.
  • Capability inventory: The skill has access to Bash, Read, Grep, and Glob tools, which provide significant control over the environment if an injection were successful.
  • Sanitization: No sanitization, validation, or escaping of the diff content is prescribed before it is processed by the agent.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 27, 2026, 05:25 PM