Web Scraping & Data Extraction Engine


Quick Health Check (Run First)

Score your scraping operation, awarding 2 points for each signal that matches the Healthy column:

| Signal | Healthy | Unhealthy |
|---|---|---|
| Legal compliance | robots.txt checked, ToS reviewed | Scraping blindly |
| Architecture | Tool matches site complexity | Using Puppeteer for static HTML |
| Anti-detection | Rotation, delays, fingerprint diversity | Single IP, no delays |
| Data quality | Validation + dedup pipeline | Raw dumps, no cleaning |
| Error handling | Retry logic, circuit breakers | Crashes on first 403 |
| Monitoring | Success rates tracked, alerts set | No visibility |
| Storage | Structured, deduplicated, versioned | Flat files, duplicates |
| Scheduling | Appropriate frequency, off-peak | Hammering during business hours |

Score: /16 → 12+: Production-ready | 8-11: Needs work | <8: Stop and redesign
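
The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the skill itself: the names `SIGNALS`, `POINTS_PER_SIGNAL`, and `health_score` are hypothetical, and it assumes all-or-nothing scoring (a signal earns 2 points or 0) with any unassessed signal counted as unhealthy.

```python
# Hypothetical helper for tallying the health check; names are illustrative.
SIGNALS = (
    "Legal compliance",
    "Architecture",
    "Anti-detection",
    "Data quality",
    "Error handling",
    "Monitoring",
    "Storage",
    "Scheduling",
)
POINTS_PER_SIGNAL = 2  # 8 signals x 2 points = 16 maximum


def health_score(healthy: dict[str, bool]) -> tuple[int, str]:
    """Return (score, verdict) for a mapping of signal name -> is it healthy?"""
    unknown = set(healthy) - set(SIGNALS)
    if unknown:
        raise ValueError(f"Unknown signals: {sorted(unknown)}")

    # Assumption: signals missing from the mapping count as unhealthy.
    score = sum(POINTS_PER_SIGNAL for name in SIGNALS if healthy.get(name, False))

    # Thresholds taken from the score line: 12+, 8-11, below 8.
    if score >= 12:
        verdict = "Production-ready"
    elif score >= 8:
        verdict = "Needs work"
    else:
        verdict = "Stop and redesign"
    return score, verdict


if __name__ == "__main__":
    assessment = {name: True for name in SIGNALS}
    assessment["Monitoring"] = False
    assessment["Scheduling"] = False
    print(health_score(assessment))
```

Rejecting unknown signal names is a deliberate choice here, so that a typo in a key cannot silently lower the score.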

Repository: openclaw/skills