skill-code-review
Warn
Audited by Gen Agent Trust Hub on May 17, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION)
Full Analysis
- [PROMPT_INJECTION]: The skill uses strong imperative language such as "MANDATORY COMPLIANCE", "PROHIBITED", and "MUST" to override default agent behavior and enforce a specific persona and branding (e.g., "CLAUDE OCTOPUS ACTIVATED"). It also references a non-existent model "gpt-5.2-codex", a technique often used in persona-based prompt injections.
- [COMMAND_EXECUTION]: The skill executes local shell scripts located in ${HOME}/.claude-octopus/ using LLM-generated strings as arguments (e.g., orchestrate.sh grasp "[review request]"). This creates a command injection surface if the LLM output contains shell metacharacters.
- [DATA_EXFILTRATION]: The skill utilizes the GitHub CLI (gh) to transmit LLM-generated review summaries to external Pull Requests. This allows for the automated exfiltration of data processed during the review if the LLM is manipulated.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection as it processes untrusted code changes.
  - Ingestion points: File contents and git diff output are read from the local repository (SKILL.md).
  - Boundary markers: There are no explicit delimiters or instructions to ignore embedded commands within the code being analyzed.
  - Capability inventory: The skill has the ability to read files (grep), execute scripts (orchestrate.sh), and communicate with GitHub (gh).
  - Sanitization: LLM-generated content is interpolated into shell commands without visible sanitization or validation logic.
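The command-execution and sanitization findings above can be illustrated with a minimal Python sketch. It is not taken from the skill and does not claim to show how the skill is implemented. It only contrasts two general ways of keeping shell metacharacters in an untrusted string from being interpreted: passing the string as a single argv element, or quoting it when a shell string is unavoidable. The names ECHO_ARGS, run_with_argv, and build_shell_command are hypothetical, and ECHO_ARGS is a stand-in child process that echoes its arguments, used here so the example runs without orchestrate.sh.

```python
import shlex
import subprocess
import sys

# Hypothetical stand-in for orchestrate.sh (assumption, not the real script):
# a child process that prints each argument it received on its own line.
ECHO_ARGS = [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))"]


def run_with_argv(review_request: str) -> str:
    """Pass the untrusted string as one argv element; no shell parses it."""
    result = subprocess.run(
        ECHO_ARGS + ["grasp", review_request],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_shell_command(review_request: str) -> str:
    """If a shell string cannot be avoided, quote the untrusted part."""
    return "orchestrate.sh grasp " + shlex.quote(review_request)


if __name__ == "__main__":
    # A string containing shell metacharacters, as an LLM might emit.
    hostile = 'fix bug"; echo INJECTED; "'
    print(run_with_argv(hostile))
    print(build_shell_command(hostile))
```

In both cases the hostile string reaches the child as a single argument rather than being split into separate commands. Quoting addresses only shell parsing; it does not validate what the receiving script then does with the argument.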
Audit Metadata