do-execute-review

Warn

Audited by Gen Agent Trust Hub on May 8, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill performs shell command execution through various tools. It runs git commands to analyze changes and, more significantly, executes project-defined scripts like npm test, pnpm test, or bun test in Step 6. Running scripts defined in a repository's package.json allows for arbitrary command execution if the repository content is malicious.
  • [PROMPT_INJECTION]: The 'Autonomous Execution Policy' contains instructions that explicitly forbid the agent from pausing or waiting for user confirmation during its procedure. This directive removes human-in-the-loop oversight, which is a critical safety control when the agent is performing high-risk operations like code execution.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection due to its processing of external, untrusted project files:
  • Ingestion points: The agent reads content from PRDs (prd.md), technical specifications (techspec.md), task lists (tasks.md), and git diff outputs.
  • Boundary markers: There are no instructions provided to the agent to treat these external files as untrusted data or to ignore potential instructions embedded within them.
  • Capability inventory: The skill possesses extensive capabilities including file system access, tool execution (MCP), and shell command execution.
  • Sanitization: No validation or sanitization of the file content is performed before it is processed by the agent to influence its review logic.
Audit Metadata
Risk Level
MEDIUM
Analyzed
May 8, 2026, 02:50 AM