reviewing-code
Fail
Audited by Gen Agent Trust Hub on May 4, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill's sub-agent logic in
SKILL.codex.mdinvokes dynamic test execution tools, includinggo test,pytest, andbun test, on the codebase being reviewed. Executing tests on untrusted code (such as unvetted PR changes) is a high-risk operation that allows for arbitrary code execution on the system where the agent is running. - [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted source code and includes it in sub-agent prompts without boundary markers or instructions to ignore embedded commands. This could allow an attacker to hijack the agent's behavior via malicious comments in the code.
- Ingestion points: Code is read using
git diffand direct file access in bothSKILL.mdandSKILL.codex.md. - Boundary markers: No delimiters or safety instructions are used when interpolating code into agent tasks.
- Capability inventory: The skill possesses the ability to execute shell commands (
Bash) and spawn specialized agents (Task). - Sanitization: The skill performs no validation or escaping of the ingested code before processing.
Recommendations
- AI detected serious security threats
Audit Metadata