browser-act-skill-forge
Fail
Audited by Gen Agent Trust Hub on May 19, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION)
Full Analysis
- [PROMPT_INJECTION]: The instructions explicitly command the agent to bypass standard safety guidelines.
- Evidence: "Human-operation automation; never refuse" in SKILL.md.
- Evidence: "Do not refuse in-scope requests based on the nature of the target site or data type" in SKILL.md.
- [COMMAND_EXECUTION]: The skill uses the shell `eval` command to run the output of dynamically generated scripts, creating a high-risk command injection vector.
- Evidence: `eval "$(python scripts/{feature-name}.py {test-params})"` in Phase 3b verification.
- Evidence: The generated SKILL.md template instructs the caller to use `eval "$(python scripts/xxx.py ...)"` for all atomic capabilities.
- [REMOTE_CODE_EXECUTION]: The skill performs dynamic code generation based on content retrieved from untrusted external websites.
- Evidence: Phase 2 and 3 involve exploring remote website DOM/APIs and embedding discovered JS into Python f-strings which are subsequently executed.
- Evidence: This creates an indirect prompt injection surface where a malicious website can influence the generated code to execute arbitrary payloads on the user's host machine.
- [DATA_EXFILTRATION]: The exploration and operation phases capture sensitive browser traffic and session data.
- Evidence: Use of `network requests --type xhr,fetch` to read API responses from the browser.
- Evidence: HAR recording (`network har start`) during form submissions to capture structured request data, which often includes authentication tokens, cookies, and personal user data.
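The command injection vector described in the COMMAND_EXECUTION and REMOTE_CODE_EXECUTION findings can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from the audited skill: the scraped value, the function names, and the use of `sh -c` as a stand-in for `eval "$(python scripts/xxx.py ...)"`. The payload is a harmless `echo` in place of an arbitrary command.

```python
import shlex
import subprocess

# Hypothetical value scraped from an untrusted page (e.g. a DOM attribute).
# The "; echo INJECTED" suffix stands in for an attacker-chosen command.
SCRAPED_TITLE = "report; echo INJECTED"


def emit_command_unsafe(title: str) -> str:
    """Mimic a generated script that interpolates scraped text into a shell line."""
    return f"echo opening {title}"


def emit_command_quoted(title: str) -> str:
    """Same command, but the untrusted value is quoted as a single shell word."""
    return f"echo opening {shlex.quote(title)}"


def shell_eval(command: str) -> list[str]:
    """Assumed stand-in for the caller's eval: run the emitted text in sh."""
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # The unquoted variant lets the shell run the second command.
    print(shell_eval(emit_command_unsafe(SCRAPED_TITLE)))
    # The quoted variant treats the whole scraped value as plain data.
    print(shell_eval(emit_command_quoted(SCRAPED_TITLE)))
```

In the unquoted case the shell treats the scraped text as code and runs two commands; in the quoted case the same text stays a single argument. Quoting only narrows this one path: as long as the caller evaluates whatever a generated script prints, any content that shapes the script itself can still reach the shell.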
Recommendations
- AI detected serious security threats