web-content-fetcher

Summary

Extract clean Markdown article content from URLs using a three-tier fallback strategy.

  • Implements cascading extraction methods: Jina Reader (fast, 200 requests/day free), Scrapling + html2text (unlimited, handles paywalled content), and direct web_fetch (static pages fallback)
  • Preserves Markdown structure including headings, links, images, lists, code blocks, and blockquotes
  • Domain-aware routing skips Jina for WeChat articles, Zhihu, Juejin, and CSDN to conserve quota and improve success rates
  • Requires scrapling[fetchers] and html2text dependencies; includes built-in script at scripts/fetch.py for manual Scrapling extraction
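
The three-tier idea in the list above can be sketched as a small helper that tries each tier in order and returns the first non-empty result. This is an illustration only, not the skill's actual code: `fetch_with_fallback` and the `(name, fetcher)` tier shape are hypothetical, and the real Jina, Scrapling, and web_fetch calls would be supplied by the caller.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a fetcher takes a URL and returns Markdown, or None on a miss.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, tiers: List[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) tier in order; return (tier name, content)."""
    errors = []
    for name, fetch in tiers:
        try:
            content = fetch(url)
        except Exception as exc:  # a failed tier should not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if content and content.strip():
            return name, content
        errors.append(f"{name}: empty result")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

A caller would pass the tiers in priority order, for example `[("jina", ...), ("scrapling", ...), ("web_fetch", ...)]`, with each fetcher wrapping the corresponding tool.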
SKILL.md

Web Content Fetcher

Given a URL, return its main content as clean Markdown — headings, links, images, lists, code blocks all preserved.

Extraction Strategy

Always try one method per URL — don't cascade blindly. Pick the right one upfront.
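
Picking the method upfront could look roughly like the sketch below, which routes by hostname. The function names and the exact domain list are assumptions made for illustration (the summary only names WeChat articles, Zhihu, Juejin, and CSDN); the `https://r.jina.ai/` prefix is Jina Reader's public endpoint.

```python
from urllib.parse import urlparse

# Assumed hostnames for the sites the summary says should skip Jina.
SKIP_JINA_DOMAINS = ("mp.weixin.qq.com", "zhihu.com", "juejin.cn", "csdn.net")
JINA_PREFIX = "https://r.jina.ai/"


def pick_method(url: str) -> str:
    """Return "scrapling" for domains that should skip Jina, else "jina"."""
    host = (urlparse(url).hostname or "").lower()
    for domain in SKIP_JINA_DOMAINS:
        # Match the domain itself and any subdomain (e.g. zhuanlan.zhihu.com).
        if host == domain or host.endswith("." + domain):
            return "scrapling"
    return "jina"


def jina_reader_url(url: str) -> str:
    """Build the Jina Reader URL by prefixing the target URL."""
    return JINA_PREFIX + url
```

Matching on the hostname suffix rather than a substring of the full URL avoids misrouting pages that merely mention one of these domains in a path or query string.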

Installs: 2.5K
GitHub Stars: 567
First Seen: Mar 9, 2026