web-content-fetcher
Summary
Extract clean Markdown article content from URLs with three-tier fallback strategies.
- Implements cascading extraction methods: Jina Reader (fast, 200 requests/day free), Scrapling + html2text (unlimited, handles paywalled content), and direct web_fetch (static pages fallback)
- Preserves Markdown structure including headings, links, images, lists, code blocks, and blockquotes
- Domain-aware routing skips Jina for WeChat articles, Zhihu, Juejin, and CSDN to conserve quota and improve success rates
- Requires scrapling[fetchers] and html2text dependencies; includes a built-in script at scripts/fetch.py for manual Scrapling extraction
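The domain-aware routing described in the list above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function name `pick_method`, the constant `SKIP_JINA_DOMAINS`, the returned labels, and the exact hostnames are assumptions, since the summary names the sites but not how they are matched.

```python
from urllib.parse import urlparse

# Assumed hostnames for the sites named in the summary (WeChat articles,
# Zhihu, Juejin, CSDN); the real skill may match them differently.
SKIP_JINA_DOMAINS = ("mp.weixin.qq.com", "zhihu.com", "juejin.cn", "csdn.net")


def pick_method(url: str) -> str:
    """Return a label for the extraction method to try first (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself or any of its subdomains, e.g. blog.csdn.net.
    for domain in SKIP_JINA_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return "scrapling"  # skip Jina to conserve quota
    return "jina"  # default fast path for everything else


if __name__ == "__main__":
    for example in ("https://mp.weixin.qq.com/s/abc", "https://example.com/post"):
        print(example, "->", pick_method(example))
```

Matching on the parsed hostname rather than a substring of the whole URL avoids false positives such as a query string that merely contains one of these domain names.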
SKILL.md
Web Content Fetcher
Given a URL, return its main content as clean Markdown — headings, links, images, lists, code blocks all preserved.
Extraction Strategy
Always try one method per URL — don't cascade blindly. Pick the right one upfront.