crawlee-web-scraper
Drop-in replacement for web_fetch when sites block automated requests. Crawlee handles session management, retry logic, and bot-detection evasion automatically.
Scripts
crawlee_fetch.py: main scraper; accepts a single URL or a file of URLs; returns JSON
crawlee_http.py: library helper; tries requests first, falls back to Crawlee on 403/429/503
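The fallback behavior described for crawlee_http.py can be sketched roughly as follows. This is a minimal illustration, not the helper's actual code: the names fetch_with_fallback, Fetcher, and BLOCKED_STATUSES are hypothetical, and stub functions stand in for the real requests and Crawlee calls so the example runs offline.

```python
from typing import Callable, Tuple

# A fetcher takes a URL and returns (HTTP status code, response body).
# Hypothetical type alias; the real helper's interface may differ.
Fetcher = Callable[[str], Tuple[int, str]]

# Status codes treated as "blocked", per the description of crawlee_http.py.
BLOCKED_STATUSES = frozenset({403, 429, 503})


def fetch_with_fallback(url: str, primary: Fetcher, fallback: Fetcher) -> Tuple[int, str]:
    """Try the cheap fetcher first; call the fallback only when blocked."""
    status, body = primary(url)
    if status in BLOCKED_STATUSES:
        return fallback(url)
    return status, body


# Stub fetchers (assumptions for illustration only).
def ok_primary(url: str) -> Tuple[int, str]:
    return 200, "<html>plain</html>"


def blocked_primary(url: str) -> Tuple[int, str]:
    return 403, "Forbidden"


def crawlee_stub(url: str) -> Tuple[int, str]:
    return 200, "<html>via fallback</html>"


if __name__ == "__main__":
    # The primary succeeds, so the fallback is never called.
    print(fetch_with_fallback("https://example.com", ok_primary, crawlee_stub))
    # The primary is blocked with 403, so the fallback result is returned.
    print(fetch_with_fallback("https://example.com", blocked_primary, crawlee_stub))
```

Passing the fetchers in as arguments keeps the decision logic separate from the network code, so the status check can be exercised without a live site. Other error codes, such as 404, are returned as-is in this sketch because a heavier fetcher is unlikely to change them.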
Usage
# Single URL, return HTML preview
python3 scripts/crawlee_fetch.py --url "https://example.com"
# Single URL, extract text (strips HTML tags)
python3 scripts/crawlee_fetch.py --url "https://example.com" --extract-text
# Bulk scrape from file