Summary

Extract and save website content as markdown files for offline access and analysis.

  • Supports configurable crawl depth (1-5 levels), breadth limits, and page caps to balance coverage against performance
  • Includes path filtering via regex patterns to focus on specific sections and exclude irrelevant content
  • Offers two modes: full-page extraction for data collection, or semantic chunking with natural language instructions for feeding results into LLM context
  • Provides a companion Map API for URL discovery without content extraction, useful for understanding site structure before full crawls
  • Authenticates via OAuth (Tavily account required) or API key; saves crawled pages as individual markdown files when output directory is specified
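The bullets above map crawl options (depth, breadth, page cap, regex path filters, semantic instructions) onto a request. As a hedged sketch, here is how such a request payload might be assembled; the parameter names (`max_depth`, `max_breadth`, `limit`, `select_paths`, `instructions`) and defaults are assumptions drawn from the feature list, not a confirmed API schema:

```python
def build_crawl_payload(url, max_depth=2, max_breadth=20, limit=50,
                        select_paths=None, instructions=None):
    """Assemble a crawl request dict (hypothetical field names).

    max_depth     -- how many link levels to follow (the text cites 1-5)
    max_breadth   -- links followed per page
    limit         -- overall page cap
    select_paths  -- regex patterns restricting which URL paths are crawled
    instructions  -- natural-language guidance for semantic chunking mode
    """
    payload = {"url": url, "max_depth": max_depth,
               "max_breadth": max_breadth, "limit": limit}
    if select_paths:
        payload["select_paths"] = select_paths
    if instructions:
        payload["instructions"] = instructions
    return payload
```

Omitting `instructions` corresponds to full-page extraction; supplying it selects the semantic-chunking mode described above.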
SKILL.md

Crawl Skill

Crawl websites to extract content from multiple pages. Ideal for documentation, knowledge bases, and site-wide content extraction.

Authentication

The script uses OAuth via the Tavily MCP server. No manual setup is required; on first run, it will:

  1. Check for existing tokens in ~/.mcp-auth/
  2. If none found, automatically open your browser for OAuth authentication

Note: You must already have a Tavily account. The OAuth flow supports login only; it cannot create accounts. Sign up at tavily.com first if you don't have one.

Alternative: API Key

If you prefer an API key, get one at https://tavily.com and add it to ~/.claude/settings.json:

{
  "env": {
    "TAVILY_API_KEY": "tvly-your-api-key-here"
  }
}