skill-code-review

Warn

Audited by Gen Agent Trust Hub on May 17, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION)
Full Analysis
  • [PROMPT_INJECTION]: The skill uses strong imperative language such as "MANDATORY COMPLIANCE", "PROHIBITED", and "MUST" to override default agent behavior and enforce a specific persona and branding (e.g., "CLAUDE OCTOPUS ACTIVATED"). It also references a non-existent model "gpt-5.2-codex", a technique often used in persona-based prompt injections.
  • [COMMAND_EXECUTION]: The skill executes local shell scripts located in ${HOME}/.claude-octopus/ using LLM-generated strings as arguments (e.g., orchestrate.sh grasp "[review request]"). This creates a command injection surface if the LLM output contains shell metacharacters.
  • [DATA_EXFILTRATION]: The skill uses the GitHub CLI (gh) to post LLM-generated review summaries to external Pull Requests. If the LLM is manipulated, this channel allows automated exfiltration of data processed during the review.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it processes untrusted code changes:
    • Ingestion points: File contents and git diff output are read from the local repository (SKILL.md).
    • Boundary markers: There are no explicit delimiters or instructions to ignore embedded commands within the code being analyzed.
    • Capability inventory: The skill can read files (grep), execute scripts (orchestrate.sh), and communicate with GitHub (gh).
    • Sanitization: LLM-generated content is interpolated into shell commands without visible sanitization or validation logic.
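
The COMMAND_EXECUTION and Sanitization findings above both come down to LLM-generated text being parsed by a shell. One common mitigation, sketched below, is to pass that text as a single element of an argument vector so that no shell ever interprets it. This is a minimal illustration only, not the skill's actual code: the helper names build_argv and run_without_shell are hypothetical, and a small Python one-liner stands in for orchestrate.sh so the example runs anywhere.

```python
import shlex
import subprocess
import sys


def build_argv(script_path: str, subcommand: str, review_request: str) -> list[str]:
    """Build an argument vector (hypothetical helper).

    Each list element reaches the target program as exactly one argument,
    no matter which characters it contains.
    """
    return [script_path, subcommand, review_request]


def run_without_shell(argv: list[str]) -> str:
    """Run a command with no shell in between and return its stdout."""
    # shell=False is the default: metacharacters such as ; | $( ) are
    # delivered to the program as literal text instead of being executed.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Simulated LLM output containing shell metacharacters.
    hostile = 'fix typo"; echo INJECTED; echo "'

    # Stand-in for orchestrate.sh (assumption): echoes the arguments it receives.
    stand_in = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
    print(run_without_shell(stand_in + ["grasp", hostile]))

    # If a shell command string cannot be avoided, shlex.quote escapes the
    # value so a POSIX shell treats it as one word.
    print(shlex.quote(hostile))
```

Run as a script, the stand-in prints a two-element list: the hostile string arrives intact as one argument and nothing is executed from it. Argument vectors only remove the shell-parsing risk; the called script must still avoid re-evaluating its arguments (for example with eval).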
Audit Metadata
Risk Level: MEDIUM
Analyzed: May 17, 2026, 07:16 AM