ralph-kage-bunshin-verify
Fail
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes shell commands
npm testandnpm run build. It also dynamically parsespackage.jsonto identify and execute additional scripts (E2E, Playwright, Cypress) based on string matching in thescriptssection. - [REMOTE_CODE_EXECUTION]: The skill is designed to run code from a worker's task. If the files in the project directory (especially
package.jsonor test files) are malicious, the agent will execute arbitrary code on the host system when attempting to verify the task. - [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it processes and follows logic based on content from untrusted files.
- Ingestion points: Reads project files including
package.json,.ralph/SPEC.md,CLAUDE.md, and.ralph/tasks.jsonto determine verification steps. - Boundary markers: None identified; the agent is instructed to read these files and use their content to guide testing and reporting.
- Capability inventory: Shell execution via
npmcommands and dynamic script execution based on file content. - Sanitization: No sanitization or validation of the scripts found in
package.jsonis performed before execution.
Recommendations
- AI detected serious security threats
Audit Metadata