Desktop Control

Summary

Control mouse, keyboard, and screen for cross-platform desktop automation.

  • Five command categories: mouse control (movement, clicks, drag, scroll), keyboard input (typing, hotkeys, key presses), screen capture and analysis (screenshots, image/text location via OCR), message dialogs (alerts, confirmations, prompts), and application control (open, focus, list windows)
  • Supports Windows, macOS, and Linux with platform-specific shortcuts and application launching
  • Image location with configurable confidence thresholds, and OCR-based text finding within a specific window or the active window
  • All commands output structured JSON for reliable programmatic parsing by AI agents, with detailed error codes and recoverable error handling
  • Includes safety patterns: verify screen state before clicking, use window-specific screenshots for performance, validate element existence before interaction, and confirm destructive actions with user dialogs
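
The safety patterns in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: `safe_click` and its three callable parameters are hypothetical names, chosen so that any locate, click, or dialog backend can be plugged in.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]


def safe_click(
    locate: Callable[[], Optional[Point]],
    click: Callable[[int, int], None],
    confirm: Optional[Callable[[], bool]] = None,
) -> bool:
    """Click only if the target exists and, when asked for, the user agrees.

    All three callables are hypothetical stand-ins for a real backend.
    """
    # Validate that the element exists before interacting with it.
    target = locate()
    if target is None:
        return False

    # For destructive actions, pass a confirm callable that asks the user.
    if confirm is not None and not confirm():
        return False

    click(*target)
    return True
```

With PyAutoGUI as the backend, `locate` could wrap `pyautogui.locateCenterOnScreen`, `click` could be `pyautogui.click`, and `confirm` could wrap `pyautogui.confirm`. Keeping them as parameters lets the check-then-act logic be exercised without a display.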
SKILL.md

Desktop Control Skill

This skill provides desktop automation through PyAutoGUI, allowing AI agents to control the mouse and keyboard, take screenshots, and interact with the desktop environment.

How to Use This Skill

As an AI agent, you can invoke desktop automation commands using the uvx desktop-agent CLI.

Command Structure

All commands follow this pattern:

uvx desktop-agent <category> <command> [arguments] [options]
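
The pattern above, together with the JSON output mentioned in the summary, suggests a thin wrapper along these lines. It is only a sketch under stated assumptions: `build_command` and `parse_result` are hypothetical helpers, the `--name value` option style is assumed rather than taken from the CLI's documentation, and the category and command names in the usage note are placeholders.

```python
import json
from typing import Any, Dict, List


def build_command(category: str, command: str, *args: Any, **options: Any) -> List[str]:
    """Assemble argv for: uvx desktop-agent <category> <command> [arguments] [options]."""
    argv = ["uvx", "desktop-agent", category, command]
    argv += [str(a) for a in args]
    for name, value in options.items():
        # Assumption: options are written as --kebab-case flags.
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # boolean switch, no value
        elif value is not False and value is not None:
            argv += [flag, str(value)]
    return argv


def parse_result(stdout: str) -> Dict[str, Any]:
    """Parse the command's JSON output, rejecting anything that is not an object."""
    result = json.loads(stdout)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object from desktop-agent")
    return result
```

A caller might pass the list from `build_command("mouse", "move", 100, 200)` to `subprocess.run(..., capture_output=True, text=True)` and hand the captured stdout to `parse_result`. The real subcommand names and JSON fields should be checked against the skill's command reference.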