playwright-scraper

Summary

Web scraping for dynamic content, authentication, pagination, and data extraction using Playwright.

  • Handles JavaScript-rendered sites, login flows, and multi-page navigation with built-in wait strategies and selector management
  • Supports headless and visible (headed) browser modes, with async/await patterns and try-catch handling for flaky elements
  • Extracts data via selectors with JSON output, captures screenshots and PDFs, and manages cookies and sessions per context
  • Reads configuration from JSON files or environment variables; runs on Node.js 14+ and supports proxy settings
SKILL.md

playwright-scraper

Purpose

This skill enables web scraping using Playwright, a Node.js library for browser automation. It focuses on handling dynamic content, authentication flows, pagination, data extraction, and screenshots to reliably scrape modern websites.

When to Use

Use this skill to scrape sites with JavaScript-rendered content (e.g., React or Angular apps), sites that require login (e.g., dashboards), or multi-page results (e.g., search results), and to capture visual data (e.g., screenshots for verification). Avoid it for static HTML sites, where a simpler tool such as requests suffices.

Key Capabilities

  • Dynamically load and interact with content using Playwright's browser control.
  • Manage authentication flows, such as logging in via forms or API tokens.
  • Handle pagination by navigating pages, clicking "next" buttons, or parsing URLs.
  • Extract data using selectors, with options for JSON output or file saves.
  • Capture full-page screenshots or PDFs for debugging or reporting (Playwright generates PDFs only in Chromium).
  • Run the browser in headless or visible mode.
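
As a rough illustration of the extraction and capture capabilities above, the sketch below reads text through selectors, writes JSON, and saves a screenshot. It assumes the playwright package is installed; the selectors (.item, .title, .price), the output file names, and the helper names toRecords and scrape are hypothetical and not part of this skill.

```javascript
// Minimal sketch, assuming `npm install playwright` has been run.
// Selectors and file names below are placeholders for illustration only.
const fsp = require('fs/promises');

// Pure helper: pair up scraped strings and trim whitespace.
// A missing or empty price becomes null.
function toRecords(titles, prices) {
  return titles.map((title, i) => ({
    title: title.trim(),
    price: (prices[i] ?? '').trim() || null,
  }));
}

async function scrape(url, { headless = true } = {}) {
  // Loaded lazily so the helper above can be used without a browser.
  const { chromium } = require('playwright');
  const browser = await chromium.launch({ headless });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    // Wait for JavaScript-rendered content to appear (hypothetical selector).
    await page.waitForSelector('.item');
    const titles = await page.locator('.item .title').allTextContents();
    const prices = await page.locator('.item .price').allTextContents();
    const records = toRecords(titles, prices);
    await fsp.writeFile('items.json', JSON.stringify(records, null, 2));
    await page.screenshot({ path: 'items.png', fullPage: true });
    return records;
  } finally {
    // Always release the browser, even if a step above throws.
    await browser.close();
  }
}
```

Calling scrape('https://example.com', { headless: false }) would run the same steps in a visible browser window, which can help when debugging selectors.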

Usage Patterns

Launch a browser and create a browser context first, then open pages from that context for navigation. Use async/await throughout so each step completes before the next begins. For authenticated scraping, keep cookies and session state scoped to a context. To paginate, loop through pages, and wrap interactions with flaky elements in try-catch so that one missing element does not abort the run. Pass configuration via JSON files or environment variables so scripts stay reusable.
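
One way the pattern above might look in practice is sketched below: configuration is read from an optional JSON file and environment variables, a context and page are created, and a loop follows a "next" link inside try-catch. The SCRAPER_* variable names, the config keys, the selectors (.result, a.next), and the function names loadConfig and run are all assumptions made for illustration.

```javascript
// Sketch only: config -> browser context -> page -> paginated loop.
// Assumes `npm install playwright`; all names below are hypothetical.
const fs = require('fs');

// Merge an optional JSON file with environment variables (env wins).
function loadConfig(env = process.env, file = env.SCRAPER_CONFIG) {
  const fromFile = file ? JSON.parse(fs.readFileSync(file, 'utf8')) : {};
  return {
    startUrl: env.SCRAPER_START_URL || fromFile.startUrl,
    maxPages: Number(env.SCRAPER_MAX_PAGES || fromFile.maxPages || 5),
    headless: (env.SCRAPER_HEADLESS || String(fromFile.headless ?? true)) !== 'false',
    proxy: env.SCRAPER_PROXY || fromFile.proxy || null,
  };
}

async function run(config) {
  const { chromium } = require('playwright'); // loaded lazily
  const browser = await chromium.launch({
    headless: config.headless,
    ...(config.proxy ? { proxy: { server: config.proxy } } : {}),
  });
  // Cookies and session state are isolated per context.
  const context = await browser.newContext();
  const page = await context.newPage();
  const rows = [];
  try {
    await page.goto(config.startUrl);
    for (let n = 0; n < config.maxPages; n++) {
      rows.push(...(await page.locator('.result').allTextContents()));
      try {
        // Flaky step: the "next" link may be absent or slow to appear.
        await page.locator('a.next').click({ timeout: 5000 });
        await page.waitForLoadState('domcontentloaded');
      } catch (err) {
        break; // no further page could be reached; stop paginating
      }
    }
  } finally {
    await browser.close();
  }
  return rows;
}
```

A script could then call run(loadConfig()) and serialize the returned rows as JSON. Letting environment variables override the file is a design choice in this sketch, not something the skill prescribes.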

Installs: 1.4K
GitHub Stars: 5
First Seen: Feb 28, 2026