crawler
Crawler
Web crawling and scraping reference covering the robots.txt protocol, the Scrapy framework, anti-bot detection, headless browsers, and legal considerations. No API keys or credentials are required; the skill outputs reference documentation only.
Commands
| Command | Description |
|---|---|
| `intro` | Crawling vs scraping, robots.txt, sitemap |
| `standards` | HTTP caching, structured data, meta tags |
| `troubleshooting` | Anti-bot detection, JS rendering, encoding |
| `performance` | Concurrency, dedup, incremental, distributed |
| `security` | Legal landscape, ethical guidelines, proxies |
| `migration` | BeautifulSoup to Scrapy, requests to Playwright |
| `cheatsheet` | Scrapy commands, CSS/XPath, curl, user-agents |
| `faq` | Legality, JS pages, blocking, storage |
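The `intro` topic lists robots.txt among its subjects. As a rough illustration of what honoring that protocol can look like, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example-bot` user-agent, and the `example.com` URLs are all made up for the example; nothing here is taken from the skill's own output.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


AGENT = "example-bot"  # hypothetical user-agent string
parser = build_parser(ROBOTS_TXT)

# Check whether the agent may fetch each URL under the rules above.
print(parser.can_fetch(AGENT, "https://example.com/index.html"))
print(parser.can_fetch(AGENT, "https://example.com/private/data.html"))

# Read the crawl delay (in seconds) that applies to this agent, if any.
print(parser.crawl_delay(AGENT))
```

In a real crawler the rules would normally be loaded from the target site (for example with `set_url()` followed by `read()`) rather than from an inline string, and the reported crawl delay would be applied between requests.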