The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: The skill uses npx -y @midscene/computer@1 to download the automation tool from the NPM registry during execution. It also allows fetching reference images from external URLs, such as GitHub assets, for visual targeting.\n- [REMOTE_CODE_EXECUTION]: By using npx, the skill executes code from a remote repository on the host machine.\n- [COMMAND_EXECUTION]: The skill uses the Bash tool to run commands that take full control of the desktop's mouse and keyboard interface.\n- [DATA_EXFILTRATION]: The take_screenshot command captures the user's entire desktop screen. This image data is transmitted to external AI model providers (e.g., Google Gemini, Aliyun Qwen) for processing, which may expose sensitive information visible in open windows.\n- [PROMPT_INJECTION]: The skill has an attack surface for indirect prompt injection because it interprets and acts upon instructions found within visual screenshots of the desktop environment.\n
Ingestion points: Full-screen desktop captures and reference images from external URLs.\n
Boundary markers: No explicit markers are present to separate user instructions from content found on the screen.\n
Capability inventory: Extensive system interaction including clicking, typing, dragging, and executing keyboard shortcuts.\n
Sanitization: The skill does not perform any visual sanitization or filtering before the vision model analyzes the screen content.

computer-automation