ralph-kage-bunshin-verify

Fail

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes shell commands npm test and npm run build. It also dynamically parses package.json to identify and execute additional scripts (E2E, Playwright, Cypress) based on string matching in the scripts section.
  • [REMOTE_CODE_EXECUTION]: The skill is designed to run code from a worker's task. If the files in the project directory (especially package.json or test files) are malicious, the agent will execute arbitrary code on the host system when attempting to verify the task.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it processes and follows logic based on content from untrusted files.
  • Ingestion points: Reads project files including package.json, .ralph/SPEC.md, CLAUDE.md, and .ralph/tasks.json to determine verification steps.
  • Boundary markers: None identified; the agent is instructed to read these files and use their content to guide testing and reporting.
  • Capability inventory: Shell execution via npm commands and dynamic script execution based on file content.
  • Sanitization: No sanitization or validation of the scripts found in package.json is performed before execution.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 28, 2026, 02:34 PM