test-feature

Fail

Audited by Gen Agent Trust Hub on Mar 19, 2026

Risk Level: HIGHCOMMAND_EXECUTIONCREDENTIALS_UNSAFEREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses a risky pattern of piping the output of a local network request (curl to port 9222) directly into the Node.js interpreter (node -p) to extract data from the Chrome DevTools Protocol JSON endpoint.
  • [CREDENTIALS_UNSAFE]: The environment setup logic automatically symlinks sensitive configuration files (.env, .env.test) from a hidden directory in the user's home folder (~/.inbox-zero/) into the application directory.
  • [DATA_EXFILTRATION]: The skill launches the system's Google Chrome application with the --remote-debugging-port=9222 flag enabled and points it to the user's actual profile directory (--user-data-dir). This exposes the user's authenticated sessions and private browser data to any process on the local machine that can connect to that port.
  • [REMOTE_CODE_EXECUTION]: Employs node -e to execute dynamically constructed JavaScript strings. These scripts use powerful modules like fs to read from standard input and WebSocket to communicate with browser internals, bypassing standard script boundaries.
  • [PROMPT_INJECTION]: The skill instructions include an open-ended directive to "verify that a feature works correctly by whatever means necessary." This high-pressure framing could lead the agent to attempt to bypass security filters or local constraints if it perceives a test is failing due to permissions.
  • [INDIRECT_PROMPT_INJECTION]:
  • Ingestion points: The skill reads untrusted data from git diff, git log, and full web page snapshots/DOM content via the agent-browser tool.
  • Boundary markers: Absent. The agent interpolates external data into its reasoning process without clear delimiters or instructions to ignore embedded commands.
  • Capability inventory: The skill has broad capabilities including file system modification (ln), package installation (pnpm install), background process management (pnpm dev), and full browser control.
  • Sanitization: There is no evidence of sanitization or escaping of the data fetched from external sources before it is processed.
Recommendations
  • HIGH: Downloads and executes remote code from: http://127.0.0.1:9222/json - DO NOT USE without thorough review
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 19, 2026, 05:06 AM