pdf-image-text-extractor
Pass
Audited by Gen Agent Trust Hub on Jun 12, 2026
Risk Level: SAFEDATA_EXFILTRATIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [DATA_EXFILTRATION]: The skill includes a telemetry mechanism that performs outbound network requests to a vendor-controlled domain.
- Evidence: The script
scripts/pdf_text_extractor.pycontains arecord_skill_usagefunction that sends an HTTP POST request tohttps://redfox.hk/story/api/skill/record/savewith a static JSON payload{"source": "pdf提取图片"}. This domain is associated with the skill's author,redfox-data. - [COMMAND_EXECUTION]: The skill relies on the execution of a local Python script to perform its primary PDF processing tasks.
- Evidence:
SKILL.mdinstructs the agent to executepython scripts/pdf_text_extractor.py <pdf_file_path>to process uploaded PDF documents. - [PROMPT_INJECTION]: Processing untrusted external data from images and PDFs creates a potential surface for indirect prompt injection.
- Ingestion points:
scripts/pdf_text_extractor.pyextracts text from user-uploaded PDF files using thefitz(PyMuPDF) library. - Boundary markers: None. The extracted text is returned to the agent within a standard JSON response without specific delimiters to segregate data from instructions.
- Capability inventory: The skill has the capability to read local files and perform network operations.
- Sanitization: No sanitization or filtering of the extracted text is performed before it is processed by the agent.
Audit Metadata