run-agent-browser
Browser Automation with agent-browser
Drive the agent-browser CLI as the agent's hands inside a browser: open URLs, snapshot interactive refs, fill forms, click, extract DOM, screenshot, switch tabs, persist sessions, run headed/stealth, or dispatch through providers. This skill owns ad hoc, terminal-driven browser tasks where a human-style operator loop (observe → act → verify) is appropriate.
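A single pass of the observe → act → verify loop might look like the sketch below. It is illustrative only: the URL, the `@e1`/`@e2` refs, and the typed value are placeholders, and the `open`, `fill`, and `click` subcommand spellings are assumed from the verbs above rather than confirmed by this document. The hypothetical `ab` wrapper defaults to a dry run that only prints each command, so the sketch can be read and run without a browser.

```shell
#!/usr/bin/env sh
# Sketch of one observe -> act -> verify pass.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: prints the command in dry-run mode, runs it otherwise.
ab() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "agent-browser $*"
  else
    agent-browser "$@"
  fi
}

operator_loop() {
  ab open "https://example.com/login"   # placeholder URL (assumption)
  ab snapshot -i                        # observe: interactive elements and their refs
  ab fill @e1 "user@example.com"        # act: @e1 is a placeholder ref from the snapshot
  ab click @e2                          # act: @e2 is a placeholder ref
  ab get url                            # verify: confirm where the click landed
}

operator_loop
```

Set `DRY_RUN=0` to run the same sequence against a real browser, after replacing the placeholder refs with the ones reported by the snapshot step.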
When to use this skill
Use this skill when:
- the user names `agent-browser`, `npx agent-browser`, `@ref` snapshots, `snapshot -i`, or any agent-browser flag (`--session-name`, `--profile`, `--headed`, `-p browserbase|browseruse|kernel|ios`, `--engine lightpanda`)
- the task is one-off browser automation done by the agent itself: log in, click, fill a form, scrape data, take a screenshot, capture page state
- the workflow needs deterministic DOM-grounded verification (`get url`, `get text`, `get value`, `is visible`, `diff snapshot`) rather than asserted test code
- multi-tab, popup, OAuth, or session-isolated flows must reuse one browser context across many commands
- hosted, mobile, geo, or anti-bot pressure requires `--headed`, stealth, profiles, or a remote provider through the same CLI
- another skill (`convert-url-to-nextjs`, `extract-saas-design`) needs live browser evidence (DOM, screenshots, runtime metadata, asset URLs) captured and handed back
Do NOT use this skill when:
- the deliverable is TypeScript code on `@onkernel/sdk` or Kernel Apps → use `build-kernel-ts-sdk`
- the deliverable is a rebuilt Next.js project from a captured site → ownership stays with `convert-url-to-nextjs`; this skill is only invoked for capture
- the deliverable is a SaaS visual-system writeup → ownership stays with `extract-saas-design`; this skill is only invoked for browser evidence
- the task is static research, DevTools-first profiling, or anything that does not require an active browser context