computer-use-agents
Pass
Audited by Gen Agent Trust Hub on May 12, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill provides patterns for executing arbitrary shell commands via the Anthropic
bashtool and Python'ssubprocessmodule, which are used to capture screenshots and perform system operations. - [COMMAND_EXECUTION]: Utilizes the
pyautoguilibrary to simulate human interactions such as clicking, typing, and scrolling, granting the agent full control over the desktop GUI. - [PROMPT_INJECTION]: The architecture employs a vision-based reasoning loop (Category 8) where the agent analyzes screenshots to determine its next action. This creates a risk of indirect prompt injection where malicious instructions embedded in a website or document displayed on screen could hijack the agent's logic.
- Ingestion points: Visual data enters the system through the
capture_screenshotmethod inperception-reasoning-action-loop.mdandanthropic-computer-use-implementation.md. - Boundary markers: No explicit markers or "ignore instruction" guidelines are implemented to help the vision model distinguish between safe system instructions and untrusted content found in screenshots.
- Capability inventory: The skill enables high-risk actions including arbitrary shell execution (
BetaToolBash20241022) and low-level GUI input manipulation (pyautogui). - Sanitization: The implementation lacks automated sanitization or human-in-the-loop verification steps for actions derived from vision-based reasoning.
Audit Metadata