second-opinion

Fail

Audited by Gen Agent Trust Hub on Apr 29, 2026

Risk Level: HIGHEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill instructs the agent to download and install packages and extensions from untrusted third-party sources.
  • It suggests installing global npm packages: @openai/codex and @google/gemini-cli, which are not official packages from those organizations.
  • It directs the installation of extensions from an unverified GitHub organization: https://github.com/gemini-cli-extensions/code-review and https://github.com/gemini-cli-extensions/security.
  • [COMMAND_EXECUTION]: The skill executes complex shell commands involving piped data and external CLI tools.
  • Use of gemini with the --yolo flag explicitly bypasses confirmation for tool execution, allowing the CLI to run any command requested by the remote model or its extensions.
  • Use of codex exec involves running external code locally, although it specifies a read-only sandbox.
  • [REMOTE_CODE_EXECUTION]: The combination of the --yolo flag and untrusted extensions facilitates Remote Code Execution (RCE).
  • If an external model (Gemini or Codex) or a malicious extension decides to execute a shell command as part of its 'review', the --yolo flag ensures it happens without human or agent intervention.
  • [DATA_EXFILTRATION]: The skill's primary function involves reading sensitive local project data and sending it to external APIs.
  • Project diffs, including uncommitted changes and tracked files, are collected and sent to OpenAI and Google servers.
  • Optionally includes the contents of CLAUDE.md or AGENTS.md (project conventions) in the external transmission.
  • [PROMPT_INJECTION]: The skill is highly vulnerable to Indirect Prompt Injection.
  • Ingestion points: Untrusted project source code diffs and file contents are gathered via git diff and Read tools (SKILL.md, references/codex-invocation.md).
  • Boundary markers: Uses --- delimiters to separate code from instructions, but lacks explicit directives to ignore instructions embedded within the code being reviewed.
  • Capability inventory: The gemini CLI can execute arbitrary tool calls via extensions when run with the --yolo flag (SKILL.md).
  • Sanitization: There is no evidence of sanitization or filtering of the diff content before it is passed to the LLM-powered CLI tools.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 29, 2026, 01:46 PM