Computer Use Agents
Patterns
Perception-Reasoning-Action Loop
The fundamental architecture of a computer use agent is a loop: observe the screen, reason about the next action, execute it, and repeat. Each iteration passes a fresh screenshot to a vision model and turns the model's output into an executed action.
Key components:
- PERCEPTION: Screenshot captures current screen state
- REASONING: Vision-language model analyzes and plans
- ACTION: Execute mouse/keyboard operations
- FEEDBACK: Observe result, continue or correct
Critical insight: a vision agent sends no mouse or keyboard input while it is "thinking" (about 1-5 seconds per step), so its activity shows a detectable pause pattern between actions.
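To make the pause pattern concrete, the sketch below scans a list of input-event timestamps for idle gaps in the 1-5 second range mentioned above. The function name `find_thinking_pauses` and its thresholds are assumptions for illustration, and a gap of this length is only a hint, since people also pause for a few seconds.

```python
from typing import List, Tuple


def find_thinking_pauses(
    event_times: List[float],
    min_gap: float = 1.0,  # assumed lower bound, taken from the 1-5 second range
    max_gap: float = 5.0,  # assumed upper bound
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs of consecutive events separated by an idle gap."""
    ordered = sorted(event_times)
    pauses: List[Tuple[float, float]] = []
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if min_gap <= gap <= max_gap:
            pauses.append((earlier, later))
    return pauses
```

For example, `find_thinking_pauses([0.0, 0.1, 2.6, 2.7, 10.0])` flags only the idle stretch between 0.1 and 2.6 seconds; the shorter gaps and the longer 7.3-second gap fall outside the assumed range.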