ralph-kage-bunshin-loop

Pass

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: SAFECOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes system commands including git worktree, npm test, and npm run build to manage the codebase and verify implementations.
  • [DATA_EXFILTRATION]: Reports task status and errors to a local watcher service at http://127.0.0.1 using curl. This is a standard coordination pattern for local agent loops and does not involve external exfiltration.
  • [PROMPT_INJECTION]: Identifies a potential surface for indirect prompt injection from task descriptions and external reference sites. It mitigates this by providing explicit instructions to the agent to treat external content as raw data and not as authoritative instructions.
  • Evidence Chain for Indirect Prompt Injection:
  • Ingestion points: Reads task descriptions from .ralph/tasks.json and external site data via reverse-engineering tools.
  • Boundary markers: Uses an 'external content warning' to delimit untrusted data in the prompt instructions.
  • Capability inventory: Includes full shell execution, file modification, and local network access.
  • Sanitization: Relies on explicit prompt-based instructions to the agent rather than automated sanitization.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 28, 2026, 02:34 PM