mineru-extract
MinerU Extract (official API)
Use MinerU as an upstream “content normalizer”: submit a URL to MinerU, poll for completion, download the result zip, and extract the main Markdown.
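The poll and extract steps of this flow can be sketched without touching the network. This is a minimal sketch, not the skill's actual implementation: `poll_until_done` and `main_markdown_from_zip` are hypothetical helper names, the status check is passed in as a callable because the MinerU endpoints are not described here, and picking the largest `.md` file as the "main Markdown" is an assumption about the result zip layout.

```python
import io
import time
import zipfile
from typing import Callable, Optional


def poll_until_done(check: Callable[[], Optional[str]],
                    interval_s: float = 2.0,
                    timeout_s: float = 300.0) -> str:
    """Call check() until it returns a non-None value (e.g. a result zip URL).

    check is a caller-supplied function; how it queries MinerU is left open.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish in time")
        time.sleep(interval_s)


def main_markdown_from_zip(zip_bytes: bytes) -> str:
    """Return the text of the largest .md file in the archive.

    Assumption: the largest Markdown file is the main document.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        md_infos = [i for i in zf.infolist()
                    if i.filename.lower().endswith(".md")]
        if not md_infos:
            raise ValueError("no Markdown file found in result zip")
        main = max(md_infos, key=lambda i: i.file_size)
        return zf.read(main).decode("utf-8")
```

Passing the status check in as a callable keeps the polling loop testable and independent of any particular HTTP client.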
Quick start (MCP-aligned)
We align to the MinerU MCP mental model, but we do not run an MCP server.
- Primary script (MCP-style): scripts/mineru_parse_documents.py
  - Input: --file-sources (comma/newline-separated)
  - Output: JSON contract on stdout: { ok, items, errors }
- Low-level script (single URL): scripts/mineru_extract.py
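A caller wrapping the primary script might handle its input and output along these lines. This is a hedged sketch: `split_file_sources` and `parse_contract` are hypothetical helpers, not functions shipped with the skill, and only the three top-level keys named above are checked because the shape of `items` and `errors` is not specified here.

```python
import json
import re
from typing import Any, Dict, List


def split_file_sources(raw: str) -> List[str]:
    """Split a comma/newline-separated --file-sources value into clean entries."""
    parts = re.split(r"[,\n]", raw)
    return [p.strip() for p in parts if p.strip()]


def parse_contract(stdout_text: str) -> Dict[str, Any]:
    """Parse the { ok, items, errors } JSON contract and check its top-level keys."""
    data = json.loads(stdout_text)
    if not isinstance(data, dict):
        raise ValueError("contract must be a JSON object")
    missing = {"ok", "items", "errors"} - set(data)
    if missing:
        raise ValueError(f"contract is missing keys: {sorted(missing)}")
    return data
```

Note that this simple split would break a URL that itself contains a comma; such URLs would need to be percent-encoded first.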
Auth:
- Set MINERU_TOKEN (Bearer token from mineru.net)
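Reading the token and turning it into a request header could look like the sketch below. `build_auth_headers` is a hypothetical helper, and the `Authorization: Bearer <token>` header form is an assumption based on the token being described as a Bearer token.

```python
import os
from typing import Dict, Mapping, Optional


def build_auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build HTTP headers from MINERU_TOKEN, failing early if it is unset."""
    env = os.environ if env is None else env
    token = env.get("MINERU_TOKEN", "").strip()
    if not token:
        raise RuntimeError("MINERU_TOKEN is not set")
    # Assumed header form for a Bearer token.
    return {"Authorization": f"Bearer {token}"}
```

Failing before any request is sent makes a missing token easier to diagnose than a rejected API call.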
Default model heuristic:
- URLs ending with .pdf/.doc/.ppt/.png/.jpg → pipeline
- Otherwise → MinerU-HTML (best for HTML pages like WeChat articles)
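The heuristic above can be sketched as a small function. `choose_model` is a hypothetical name, and matching on the URL path (ignoring query string and fragment) in a case-insensitive way is an assumption; the scripts may apply the rule differently.

```python
from urllib.parse import urlparse

# Extensions listed in the heuristic above.
FILE_EXTENSIONS = (".pdf", ".doc", ".ppt", ".png", ".jpg")


def choose_model(url: str) -> str:
    """Pick the default model name from the URL's file extension."""
    path = urlparse(url).path.lower()
    return "pipeline" if path.endswith(FILE_EXTENSIONS) else "MinerU-HTML"
```

For example, `choose_model("https://mp.weixin.qq.com/s/abc")` falls through to `MinerU-HTML`, while a link whose path ends in `.pdf` selects `pipeline`.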