pdf-text-extractor
Pass
Audited by Gen Agent Trust Hub on Apr 1, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill utilizes the
pdfjs-distpackage from the NPM registry to perform PDF parsing operations. - [COMMAND_EXECUTION]: The skill uses the Node.js
fsmodule to read file content from the local file system based on thepdfPathprovided in the tool parameters. - [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted text from external PDF files and returns it to the agent without sanitization or boundary markers. A maliciously crafted PDF could contain instructions designed to override the agent's behavior.
- Ingestion points:
index.jsreads file content usingfs.readFileSyncfrom a path provided as a tool argument. - Boundary markers: No delimiters or "ignore instructions" warnings are present in the code or instructions to mitigate the impact of text extracted from untrusted sources.
- Capability inventory: The skill has access to the local file system and returns full document text to the agent's context.
- Sanitization: The skill does not perform any validation, escaping, or filtering on the text extracted from the PDF pages.
- [PROMPT_INJECTION]: The skill metadata is misleading. Both
README.mdandSKILL.mdclaim the skill has "zero dependencies" and provides "OCR support" using Tesseract.js. However, thepackage.jsonidentifies a dependency onpdfjs-dist, and theindex.jsimplementation contains no OCR logic or Tesseract integration.
Audit Metadata