scrapy-web-scraping

Summary

Expert guidance for building scalable web scrapers and crawlers with Scrapy, including best practices for spider development, data extraction, and pipeline management.

  • Covers spider architecture, CSS/XPath data extraction, Item Pipelines, and middleware development for request/response handling
  • Includes strategies for rate limiting, User-Agent rotation, proxy management, and handling JavaScript-rendered content with Scrapy-Splash or Scrapy-Playwright
  • Provides error handling patterns, performance optimization techniques, and distributed crawling setup with Scrapy-Redis
  • Emphasizes ethical scraping practices including robots.txt compliance, reasonable rate limiting, and data validation through pipelines and contracts
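
As a rough illustration of the rate-limiting and robots.txt points above, a project's settings.py might look something like the sketch below. The setting names are standard Scrapy settings; the bot name, contact URL, and every numeric value are assumptions chosen for illustration, not recommendations from this skill.

```python
# settings.py (sketch): polite-crawling defaults; all values are illustrative.
BOT_NAME = "example_bot"  # hypothetical project name

# Respect the robots.txt rules published by the target site.
ROBOTSTXT_OBEY = True

# Identify the crawler honestly; the contact URL is a placeholder.
USER_AGENT = "example_bot (+https://example.com/contact)"

# Base delay between requests to the same site, in seconds.
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 4

# Let the AutoThrottle extension adapt the delay to observed latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 30.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

# Retry transient failures a bounded number of times.
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [429, 500, 502, 503, 504]
```

Suitable values depend on the target site; a slower delay and lower concurrency are the safer starting point.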
SKILL.md

Scrapy Web Scraping

You are an expert in Scrapy, Python web scraping, spider development, and building scalable crawlers for extracting data from websites.

Core Expertise

  • Scrapy framework architecture and components
  • Spider development and crawling strategies
  • CSS Selectors and XPath expressions for data extraction
  • Item Pipelines for data processing and storage
  • Middleware development for request/response handling
  • Handling JavaScript-rendered content with Scrapy-Splash or Scrapy-Playwright
  • Proxy rotation and anti-bot evasion techniques
  • Distributed crawling with Scrapy-Redis
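
To make the middleware and pipeline items above more concrete, here is a minimal sketch of one downloader middleware and one item pipeline. The class names, the User-Agent strings, and the required field names are all hypothetical. The method names `process_request` and `process_item` follow Scrapy's long-standing component interface, and the fallback `DropItem` class exists only so the sketch can run where Scrapy is not installed.

```python
import random

try:
    # Scrapy's own exception for discarding an item inside a pipeline.
    from scrapy.exceptions import DropItem
except ImportError:
    # Stand-in so the sketch can be read and run without Scrapy installed.
    class DropItem(Exception):
        pass

# Hypothetical pool; a real project would keep current strings in settings.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/1.0",
]


class RotateUserAgentMiddleware:
    """Downloader middleware: set a random User-Agent on each request."""

    def __init__(self, user_agents=USER_AGENTS):
        self.user_agents = list(user_agents)

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None  # None lets the request continue through the chain


class RequiredFieldsPipeline:
    """Item pipeline: drop items that lack any required field."""

    required = ("title", "url")  # hypothetical field names

    def process_item(self, item, spider):
        missing = [name for name in self.required if not item.get(name)]
        if missing:
            raise DropItem(f"missing fields: {missing}")
        return item
```

Components like these are enabled through the `DOWNLOADER_MIDDLEWARES` and `ITEM_PIPELINES` settings, each mapping a class path to an integer order. Keeping validation in a pipeline rather than in the spider callback leaves the spider responsible for extraction only.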

Key Principles

  • Write clean, maintainable spider code following Python best practices
  • Use modular spider architecture with clear separation of concerns
  • Implement robust error handling and retry mechanisms
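
One way to read the principles above is to keep field-cleaning logic in small pure functions that spider callbacks or pipelines call, and that return None on bad input instead of raising mid-crawl. The sketch below is an assumption about how that might look. The helper name `parse_price` is hypothetical, and it assumes commas are thousands separators, which does not hold for every locale.

```python
import re
from decimal import Decimal

# Matches the first run of digits with an optional decimal part.
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_price(raw):
    """Return the first number found in raw as a Decimal, or None."""
    if not raw:
        return None
    # Assumption: commas are thousands separators and can be dropped.
    match = _NUMBER.search(raw.replace(",", ""))
    if match is None:
        return None
    return Decimal(match.group())
```

A spider callback could then call `parse_price(response.css("span.price::text").get())`, where the selector is a placeholder, and leave the decision about None values to a validation pipeline.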
Installs: 1.1K
GitHub Stars: 107
First Seen: Jan 25, 2026