reviewing-code

Fail

Audited by Gen Agent Trust Hub on May 4, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill's sub-agent logic in SKILL.codex.md invokes dynamic test execution tools, including go test, pytest, and bun test, on the codebase being reviewed. Executing tests on untrusted code (such as unvetted PR changes) is a high-risk operation that allows for arbitrary code execution on the system where the agent is running.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted source code and includes it in sub-agent prompts without boundary markers or instructions to ignore embedded commands. This could allow an attacker to hijack the agent's behavior via malicious comments in the code.
  • Ingestion points: Code is read using git diff and direct file access in both SKILL.md and SKILL.codex.md.
  • Boundary markers: No delimiters or safety instructions are used when interpolating code into agent tasks.
  • Capability inventory: The skill possesses the ability to execute shell commands (Bash) and spawn specialized agents (Task).
  • Sanitization: The skill performs no validation or escaping of the ingested code before processing.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
May 4, 2026, 04:17 AM
Security Audit — agent-trust-hub — reviewing-code