Web Scraping Skill — Chrome (Playwright) + DuckDuckGo

A privacy-minded, agent-facing web-scraping skill that drives headless Chrome (through Playwright or Puppeteer) and uses DuckDuckGo for search. It focuses on reliable navigation, structured text extraction, robots.txt compliance, and rate limiting.

When to use

Collect public webpage content for summarization, metadata extraction, or link discovery.
Use DuckDuckGo for queries when you want a privacy-respecting search source.
Do NOT use it to bypass paywalls, scrape private or logged-in content, or violate a site's Terms of Service.

Safety & etiquette

Always check and respect /robots.txt before scraping a site.
Rate-limit requests (default: 1 request/sec) and send a descriptive User-Agent string that identifies the scraper.
Avoid executing arbitrary user-provided JavaScript on scraped pages.
Only scrape public content; if a page requires login, return a login_required status instead of attempting to bypass it.
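
The robots.txt and rate-limit rules above could be enforced along these lines. This is a minimal sketch using only the Python standard library, not the skill's actual implementation: the names USER_AGENT, robots_url, is_allowed, and RateLimiter are hypothetical, and the robots.txt body is assumed to have been fetched already (for example from the URL that robots_url returns).

```python
import time
import urllib.robotparser
from urllib.parse import urlsplit

# Hypothetical User-Agent; replace it with one that identifies your agent.
USER_AGENT = "example-scraping-skill/0.1 (+https://example.com/contact)"


def robots_url(page_url):
    """Return the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(page_url, robots_txt, user_agent=USER_AGENT):
    """Check page_url against robots.txt content that was already fetched."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


class RateLimiter:
    """Block so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; 1.0 matches the 1 request/sec default
        self._clock = clock               # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A caller would typically run is_allowed once per URL and call RateLimiter.wait() immediately before each page visit. A single shared limiter is the simplest choice; keeping one limiter per host is a possible refinement when several sites are scraped in one run.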

Capabilities

Search DuckDuckGo and return top-N result links.
Visit result pages in headless Chrome and extract title, meta description, main text (or best-effort article text), and canonical URL.
Return results as structured JSON for downstream consumption.
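
As an illustration of the extraction and JSON output described above, the sketch below parses an HTML string with Python's standard html.parser module. It assumes the HTML was already obtained from the headless browser (with Playwright's Python API this would typically be the string returned by page.content()). The PageResult field names, the status values, and the "join all paragraph text" heuristic for main text are assumptions made for this example, not a schema defined by the skill.

```python
import json
from dataclasses import asdict, dataclass
from html.parser import HTMLParser


@dataclass
class PageResult:
    url: str
    status: str = "ok"          # assumed values: "ok" or "login_required"
    title: str = ""
    meta_description: str = ""
    canonical_url: str = ""
    main_text: str = ""


class _PageParser(HTMLParser):
    """Collect the title, meta description, canonical link, and <p> text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.canonical = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.description = (a.get("content") or "").strip()
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = (a.get("href") or "").strip()
        elif tag == "p":
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            text = " ".join("".join(self._buf).split())  # collapse whitespace
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self._buf.append(data)


def extract_page(url, html, login_required=False):
    """Build a PageResult; pages behind a login are reported, not scraped."""
    if login_required:
        return PageResult(url=url, status="login_required")
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    return PageResult(
        url=url,
        title=" ".join(parser.title.split()),
        meta_description=parser.description,
        canonical_url=parser.canonical,
        main_text="\n\n".join(parser.paragraphs),  # best-effort article text
    )


def to_json(results):
    """Serialize a list of PageResult objects for downstream consumers."""
    return json.dumps([asdict(r) for r in results], ensure_ascii=False, indent=2)
```

Paragraph joining is a deliberately crude stand-in for real article extraction; a readability-style extractor could replace it without changing the output shape. How a login wall is detected is left to the caller here, which passes login_required=True once it has decided.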
