browser-automation

Browser Automation

Available Tools

  • browser_act(instruction, starting_url?): Execute browser actions using natural language (click, type, scroll, select). Use starting_url to navigate to a page and act in a single call.
  • browser_get_page_info(url?, text?, tables?, links?): Get page structure and DOM data (fast, no AI). Use url to navigate first; text=True for full text, tables=True for table data, links=True for all links.
  • browser_manage_tabs(action, tab_index?, url?): Switch between, close, or create browser tabs.
  • browser_save_screenshot(filename): Save a screenshot of the current page to the workspace.
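
A minimal sketch of how these calls might be sequenced, assuming the tools are exposed as Python-style callables (the text=True notation suggests this, but it is not confirmed here). The function bodies below are hypothetical stand-ins that only record each call so the example runs on its own; the real tools are supplied by the agent runtime, and the URL, instruction text, and filename are illustrative.

```python
from typing import Any, Optional

# Hypothetical stand-ins for the runtime-provided tools.
# Each stub only records its arguments; nothing here drives a real browser.
calls = []

def browser_act(instruction: str, starting_url: Optional[str] = None) -> None:
    calls.append(("browser_act",
                  {"instruction": instruction, "starting_url": starting_url}))

def browser_get_page_info(url: Optional[str] = None, text: bool = False,
                          tables: bool = False, links: bool = False) -> None:
    calls.append(("browser_get_page_info",
                  {"url": url, "text": text, "tables": tables, "links": links}))

def browser_save_screenshot(filename: str) -> None:
    calls.append(("browser_save_screenshot", {"filename": filename}))

# Illustrative flow: navigate and act in one call, read table data
# from the resulting page without AI, then keep a visual record.
browser_act("Type 'laptops' into the search box and press Enter",
            starting_url="https://example.com")
browser_get_page_info(tables=True)
browser_save_screenshot("results.png")

for name, args in calls:
    print(name, args)
```

Passing starting_url to browser_act avoids a separate navigation step, and omitting url in browser_get_page_info is assumed here to inspect the page that is already open.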

When to Use

Use browser automation when the task genuinely requires it:

  • UI interactions: Filling forms, clicking buttons, navigating multi-step workflows
  • Login-required pages: Accessing content behind authentication that APIs cannot reach
  • Dynamic/JS-heavy pages: Content rendered client-side that plain HTTP requests can't capture
  • Human-like browsing needed: Sites that block bots or require realistic interaction patterns
  • Scraping structured data: When no API exists and the data must be extracted from rendered pages

Prefer web search or url_fetcher for general information lookup, news, or publicly accessible pages, since browser automation is slower and heavier. Reserve it for tasks where simpler tools are insufficient.
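
The guidance above can be read as a simple decision rule. The sketch below is one possible encoding, not part of the skill itself: the Task fields and the choose_tool function are hypothetical names chosen to mirror the five criteria listed, and the returned strings are labels only.

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Hypothetical task traits, one per criterion in the list above."""
    needs_ui_interaction: bool = False   # forms, buttons, multi-step workflows
    needs_login: bool = False            # content behind authentication
    js_rendered: bool = False            # content rendered client-side
    needs_humanlike_browsing: bool = False  # site blocks bots
    needs_rendered_scrape: bool = False  # structured data, no API available

def choose_tool(task: Task) -> str:
    """Return a label for the lighter tool unless a criterion demands a browser."""
    requires_browser = any((
        task.needs_ui_interaction,
        task.needs_login,
        task.js_rendered,
        task.needs_humanlike_browsing,
        task.needs_rendered_scrape,
    ))
    if requires_browser:
        return "browser automation"
    return "web search or url_fetcher"

print(choose_tool(Task()))                  # general lookup, no special needs
print(choose_tool(Task(needs_login=True)))  # authenticated content
```

Defaulting every trait to False makes the lighter tools the fallback, which matches the advice to reserve browser automation for cases where simpler tools are insufficient.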

Installs: 37
GitHub Stars: 158
First Seen: Mar 1, 2026