pdf-text-extractor

Pass

Audited by Gen Agent Trust Hub on Apr 1, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill utilizes the pdfjs-dist package from the NPM registry to perform PDF parsing operations.
  • [COMMAND_EXECUTION]: The skill uses the Node.js fs module to read file content from the local file system based on the pdfPath provided in the tool parameters.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted text from external PDF files and returns it to the agent without sanitization or boundary markers. A maliciously crafted PDF could contain instructions designed to override the agent's behavior.
  • Ingestion points: index.js reads file content using fs.readFileSync from a path provided as a tool argument.
  • Boundary markers: No delimiters or "ignore instructions" warnings are present in the code or instructions to mitigate the impact of text extracted from untrusted sources.
  • Capability inventory: The skill has access to the local file system and returns full document text to the agent's context.
  • Sanitization: The skill does not perform any validation, escaping, or filtering on the text extracted from the PDF pages.
  • [PROMPT_INJECTION]: The skill metadata is misleading. Both README.md and SKILL.md claim the skill has "zero dependencies" and provides "OCR support" using Tesseract.js. However, the package.json identifies a dependency on pdfjs-dist, and the index.js implementation contains no OCR logic or Tesseract integration.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 1, 2026, 03:18 PM