Scrape

Pre-Scrape Compliance Checklist

Before writing any scraping code:

  1. robots.txt — Fetch {domain}/robots.txt, check if target path is disallowed. If yes, stop.
  2. Terms of Service — Check /terms, /tos, /legal. Explicit scraping prohibition = need permission.
  3. Data type — Public factual data (prices, listings) is safer. Personal data triggers GDPR/CCPA.
  4. Authentication — Data behind login is off-limits without authorization. Never scrape protected content.
  5. API available? — If site offers an API, use it. Always. Scraping when API exists often violates ToS.
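
As an illustration of steps 1 and 2, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The user-agent string and the helper names (`robots_url`, `is_allowed`, `check_live`, `tos_candidates`) are assumptions made for this example and are not part of the skill itself. The sketch only automates the robots.txt lookup and lists the ToS pages to read; reviewing the terms remains a manual step.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent; use the string your scraper actually sends.
USER_AGENT = "example-scraper"


def robots_url(target_url: str) -> str:
    """Build the robots.txt URL for the site that hosts target_url."""
    parts = urlsplit(target_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(robots_txt: str, target_url: str, user_agent: str = USER_AGENT) -> bool:
    """Check target_url against robots.txt content that was already fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, target_url)


def check_live(target_url: str, user_agent: str = USER_AGENT) -> bool:
    """Fetch the site's robots.txt over the network, then check target_url."""
    parser = RobotFileParser(robots_url(target_url))
    parser.read()  # performs an HTTP request
    return parser.can_fetch(user_agent, target_url)


def tos_candidates(target_url: str) -> list[str]:
    """List the common ToS locations from step 2 for manual review."""
    parts = urlsplit(target_url)
    base = f"{parts.scheme}://{parts.netloc}"
    return [base + path for path in ("/terms", "/tos", "/legal")]


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /private/\n"
    url = "https://example.com/private/data"
    if not is_allowed(sample, url):
        print(f"Disallowed by robots.txt, stopping: {url}")
```

Splitting the parsing (`is_allowed`) from the network fetch (`check_live`) keeps the rule check testable offline; in practice `check_live` would be called once per domain and the result reused.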

Legal Boundaries

  • Public data, no login — Generally not a CFAA violation (hiQ v. LinkedIn, 9th Cir. 2022), though breach-of-ToS claims can still apply
  • Bypassing barriers — CFAA violation risk (Van Buren v. US 2021)
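  • Ignoring robots.txt — Gray area: robots.txt is not law in itself, but ignoring it can support a ToS-breach claim. In Meta v. Bright Data (N.D. Cal. 2024), the court found Meta's terms did not bar logged-out scraping of public data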
  • Personal data without consent — GDPR/CCPA violation
  • Republishing copyrighted content — Copyright infringement

Request Discipline

First Seen
May 15, 2026
Scrape — bighardperson/computer-science-skills-collection