universal-scraping-architect
Universal Scraping Architect
Design complete, robust data-extraction pipelines with intelligent routing, validation, and token-budget tracking — not brittle one-off scripts.
Dependency Notice: Firecrawl follows a BYOK (Bring Your Own Key) pattern; load API keys only from environment variables, never from source files or committed config. Per-script dependencies:
| Script | Dependencies | Exact CLI |
|---|---|---|
| `scripts/validate_extraction.py` | stdlib only | `python3 scripts/validate_extraction.py output.json --json` |
| `scripts/firecrawl_example.py` | firecrawl, requests (template; `--sample` runs offline) | `python3 scripts/firecrawl_example.py --sample` |
| `scripts/local_bs4_example.py` | beautifulsoup4, pandas (template; `--sample` runs offline) | `python3 scripts/local_bs4_example.py --sample` |
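
The BYOK rule in the dependency notice can be sketched as a small stdlib-only helper that fails fast when the key is missing. This is a minimal illustration, not the skill's own code: the function name `load_api_key` and the variable name `FIRECRAWL_API_KEY` are assumptions chosen for the example.

```python
import os


def load_api_key(env_var: str = "FIRECRAWL_API_KEY", environ=None) -> str:
    """Return the API key from the environment, or raise if it is absent.

    `env_var` is an assumed variable name; `environ` lets callers pass a
    plain dict (useful in tests) instead of the real process environment.
    """
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        # Fail early with an actionable message rather than sending
        # unauthenticated requests later in the pipeline.
        raise RuntimeError(
            f"{env_var} is not set; export it before running the scraper."
        )
    return key
```

A script would call `load_api_key()` once at startup and pass the result to its client, so the key never appears in source files, logs, or command-line arguments.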
Before Starting
Check for context first:
If project-context.md exists, read it before asking questions. Determine the target data format, scale of extraction, and deployment environment before writing any code.
How This Skill Works
This skill supports 3 extraction modes based on intelligent routing: