web-scraping

Summary

Web scraping and data extraction using Python tools for static, dynamic, and large-scale content.

  • Supports static sites via requests and BeautifulSoup, dynamic content via Selenium and Playwright, and large-scale extraction via Scrapy and Firecrawl
  • Includes specialized tools for AI-powered extraction (Jina), structured queries (AgentQL), and complex automation workflows (MultiOn)
  • Built-in guidance on rate limiting, robots.txt compliance, error handling, session management, and pagination
  • Covers data processing tasks: cleaning, validation, encoding handling, deduplication, and efficient storage
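
As a rough illustration of the rate limiting and robots.txt points above, the sketch below uses only the Python standard library. The user agent string, the `build_robots` helper, and the `RateLimiter` class are hypothetical names chosen for this example, not part of the skill itself.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-scraper"  # hypothetical user agent for illustration


def build_robots(lines):
    """Parse robots.txt content (an iterable of lines) into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


class RateLimiter:
    """Enforce a minimum interval (in seconds) between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


if __name__ == "__main__":
    robots = build_robots(["User-agent: *", "Disallow: /private/"])
    limiter = RateLimiter(min_interval=1.0)
    for url in ["https://example.com/public/a", "https://example.com/private/b"]:
        if not robots.can_fetch(USER_AGENT, url):
            print("skipping (disallowed):", url)
            continue
        limiter.wait()
        print("would fetch:", url)
```

In a real crawl the robots.txt lines would come from the target site (for example via `RobotFileParser.set_url` and `read`), and the interval would be tuned to the site's tolerance.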
SKILL.md

Web Scraping

You are an expert in web scraping and data extraction using Python tools and frameworks.

Core Tools

Static Sites

  • Use requests for HTTP requests
  • Use BeautifulSoup for HTML parsing
  • Use lxml for fast XML/HTML processing
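
A minimal sketch of the static-site workflow, assuming `requests` and `beautifulsoup4` are installed. The function names, the user agent string, and the choice to extract links are illustrative assumptions; the built-in `html.parser` backend is used here, and `lxml` could be substituted where it is available.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10.0):
    """Download a page and return its text, raising on HTTP error statuses."""
    response = requests.get(
        url,
        headers={"User-Agent": "example-scraper/0.1"},  # hypothetical UA
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text


def extract_links(html, base_url):
    """Return the text and absolute URL of every anchor that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.select("a[href]"):
        links.append(
            {
                "text": anchor.get_text(strip=True),
                "url": urljoin(base_url, anchor["href"]),
            }
        )
    return links


if __name__ == "__main__":
    page_url = "https://example.com/"
    for link in extract_links(fetch_html(page_url), page_url):
        print(link["text"], "->", link["url"])
```

Keeping the fetch and parse steps in separate functions makes the parsing logic easy to exercise against saved HTML without touching the network.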

Dynamic Content

  • Use Selenium for JavaScript-rendered pages
  • Use Playwright for modern web automation
  • Use pyppeteer (an unofficial Python port of Puppeteer) for headless Chromium browsing
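
For JavaScript-rendered pages, one possible shape with Playwright's synchronous API is sketched below. It assumes `playwright` and a Chromium build are installed; `extract_texts`, `scrape`, the URL, and the selector are hypothetical names for this example. The Playwright import is deferred into `scrape` so the extraction helper can be exercised without a browser.

```python
def extract_texts(page, selector):
    """Wait for the selector to appear, then return the stripped text of each match."""
    page.wait_for_selector(selector)
    return [text.strip() for text in page.locator(selector).all_text_contents()]


def scrape(url, selector):
    """Open the URL in headless Chromium and collect text for the selector."""
    # Imported here so the helper above does not require Playwright to load.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            return extract_texts(page, selector)
        finally:
            browser.close()


if __name__ == "__main__":
    # Placeholder URL and selector; replace with a page you may scrape.
    for heading in scrape("https://example.com/", "h1"):
        print(heading)
```

Waiting on a specific selector rather than a fixed sleep tends to be more reliable when content arrives after the initial page load.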

Large-Scale Extraction

  • Use Scrapy for structured crawling
  • Use Jina for AI-powered extraction
First Seen: Jan 25, 2026