using-web-scraping

Web Scraping Skill — Chrome (Playwright) + DuckDuckGo

A privacy-minded, agent-facing web-scraping skill that drives headless Chrome (via Playwright/Puppeteer) and uses DuckDuckGo for search. It focuses on reliable navigation, structured text extraction, robots.txt compliance, and rate limiting.

When to use

  • Collect public webpage content for summarization, metadata extraction, or link discovery.
  • Use DuckDuckGo for queries when you want a privacy-respecting search source.
  • Do NOT use it to bypass paywalls, scrape private or logged-in content, or violate a site's Terms of Service.

Safety & etiquette

  • Always check and respect /robots.txt before scraping a site.
  • Rate-limit requests (default: 1 request/sec) and send a polite User-Agent string that identifies the scraper.
  • Avoid executing arbitrary user-provided JavaScript on scraped pages.
  • Only scrape public content; if login is required, return login_required instead of attempting to bypass.
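
The robots.txt and rate-limiting rules above can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's actual implementation: the `USER_AGENT` value and the names `build_robots_parser`, `may_fetch`, and `RateLimiter` are hypothetical, and the robots.txt body is assumed to have been fetched already.

```python
import time
from urllib import robotparser

# Hypothetical User-Agent; a real one should name the bot and a contact URL.
USER_AGENT = "example-scraper-bot/0.1 (+https://example.com/bot-info)"


def build_robots_parser(robots_txt: str) -> robotparser.RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: robotparser.RobotFileParser, url: str,
              user_agent: str = USER_AGENT) -> bool:
    """Return True if the parsed robots.txt rules allow fetching the URL."""
    return parser.can_fetch(user_agent, url)


class RateLimiter:
    """Enforce a minimum interval between requests (default: 1 request/sec)."""

    def __init__(self, min_interval: float = 1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None        # time of the previous request, if any

    def wait(self) -> float:
        """Sleep if the last request was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

In use, a caller would check `may_fetch(...)` for each URL and call `limiter.wait()` immediately before each page load, skipping any URL the rules disallow.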

Capabilities

  • Search DuckDuckGo and return top-N result links.
  • Visit result pages in headless Chrome and extract title, meta description, main text (or best-effort article text), and canonical URL.
  • Return results as structured JSON for downstream consumption.
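
As a rough sketch of the extraction and JSON steps listed above, the following pulls the title, meta description, and canonical URL out of an HTML string using only the standard library. It assumes the HTML has already been obtained (for example from Playwright's `page.content()`), does not attempt main-text extraction, and uses hypothetical field names (`url`, `title`, `meta_description`, `canonical_url`); the skill's real output schema is not specified here.

```python
import json
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collect <title>, the meta description, and the canonical link."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content")
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_record(url: str, html: str) -> dict:
    """Build one result record; fall back to the visited URL if no canonical."""
    parser = PageMetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": parser.title.strip() or None,
        "meta_description": parser.description,
        "canonical_url": parser.canonical or url,
    }


def to_json(records: list) -> str:
    """Serialize result records as JSON for downstream consumers."""
    return json.dumps(records, ensure_ascii=False, indent=2)
```

Falling back to the visited URL when no canonical link is present is a design choice of this sketch; returning `null` instead would be equally reasonable.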

Installs: 21
GitHub Stars: 111
First Seen: Mar 1, 2026