MinerU Document Extractor

Pass

Audited by Gen Agent Trust Hub on May 12, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill instructs the installation of the mineru-open-api package from the official NPM registry and its source from the author's GitHub repository (github.com/opendatalab/MinerU-Ecosystem). These are verified vendor resources.
  • [COMMAND_EXECUTION]: The skill uses the mineru-open-api CLI to perform document conversions, authentication, and web crawling tasks. The instructions include security-conscious practices such as quoting file paths to prevent command injection in the shell.
  • [DATA_EXFILTRATION]: The skill includes functionality to fetch content from external URLs via the crawl and extract commands. This is the primary intended purpose of the tool and is used to process documents provided by the user.
  • [INDIRECT_PROMPT_INJECTION]: A potential attack surface exists because the skill processes untrusted external data (PDFs, Word documents, and web pages). Malicious content embedded in these documents could attempt to influence the agent's behavior.
  • Ingestion points: External documents and URLs processed by flash-extract, extract, and crawl commands in SKILL.md.
  • Boundary markers: None explicitly defined for document content parsing.
  • Capability inventory: The skill has command execution and network access capabilities via the CLI.
  • Sanitization: SKILL.md instructs the agent to quote file paths, but static sanitization of the extracted document content is not present in the instructions.
Audit Metadata
Risk Level
SAFE
Analyzed
May 12, 2026, 09:15 AM