chem-data-extractor

Pass

Audited by Gen Agent Trust Hub on Mar 30, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/batch_extract.py utilizes subprocess.run to coordinate PDF conversion and data extraction tasks. It executes the local extract_chem_data.py script and searches for a third-party mineru-pdf-converter skill in standard agent configuration directories. All commands are executed with structured argument lists, which is the secure way to use this functionality.
  • [PROMPT_INJECTION]: The skill processes untrusted text from chemical research papers, which presents an indirect prompt injection surface. Maliciously crafted documents could potentially influence the AI agent's behavior when it processes the extracted data.
  • Ingestion points: Reads PDF files and Markdown content from user-specified directories and files in scripts/batch_extract.py and scripts/extract_chem_data.py.
  • Boundary markers: The skill does not implement specific delimiters or instructions to the agent to disregard potential instructions embedded within the extracted chemistry data fields.
  • Capability inventory: The skill allows the agent to execute shell commands via Python's subprocess module and perform file system operations (read/write/delete).
  • Sanitization: Regular expression patterns are used to extract specific fields (NMR, HRMS, etc.), which constrains the format of the output but does not sanitize the extracted strings for natural language instructions.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 30, 2026, 06:44 AM