chem-data-extractor
Pass
Audited by Gen Agent Trust Hub on Mar 30, 2026
Risk Level: SAFE (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The script `scripts/batch_extract.py` utilizes `subprocess.run` to coordinate PDF conversion and data extraction tasks. It executes the local `extract_chem_data.py` script and searches for a third-party `mineru-pdf-converter` skill in standard agent configuration directories. All commands are executed with structured argument lists, which is the secure way to use this functionality.
- [PROMPT_INJECTION]: The skill processes untrusted text from chemical research papers, which presents an indirect prompt injection surface. Maliciously crafted documents could potentially influence the AI agent's behavior when it processes the extracted data.
- Ingestion points: Reads PDF files and Markdown content from user-specified directories and files in `scripts/batch_extract.py` and `scripts/extract_chem_data.py`.
- Boundary markers: The skill does not implement specific delimiters or instructions to the agent to disregard potential instructions embedded within the extracted chemistry data fields.
- Capability inventory: The skill allows the agent to execute shell commands via Python's subprocess module and perform file system operations (read/write/delete).
- Sanitization: Regular expression patterns are used to extract specific fields (NMR, HRMS, etc.), which constrains the format of the output but does not sanitize the extracted strings for natural language instructions.
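The argument-list pattern noted under COMMAND_EXECUTION can be illustrated with a minimal sketch. The helper names `build_command` and `run_command` are hypothetical, and the single positional argument is an assumption: the audit does not state the actual command-line interface of `extract_chem_data.py`.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script: Path, markdown_file: Path) -> list[str]:
    """Build an argument list; each element reaches the child process verbatim."""
    # Assumed CLI shape: the real arguments of extract_chem_data.py
    # are not given in the audit.
    return [sys.executable, str(script), str(markdown_file)]


def run_command(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run a command given as a list of arguments, without a shell."""
    # shell=True is not used, so characters such as ';' or '$(...)' in a
    # file name are passed through as data instead of being interpreted.
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

Because no shell parses the command, a file name such as `paper; rm -rf x.md` arrives at the child process as one literal argument.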
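One possible way to address the boundary-marker and sanitization gaps listed above is sketched here, under stated assumptions. The delimiter strings, the `SUSPICIOUS` pattern, and the function names are all illustrative and are not part of the audited skill; the keyword heuristic will miss many injection phrasings and is no substitute for instructing the agent to treat delimited content as data.

```python
import re

# Hypothetical delimiters; the audited skill defines none.
BEGIN = "<<<UNTRUSTED_EXTRACTED_DATA>>>"
END = "<<<END_UNTRUSTED_EXTRACTED_DATA>>>"

# Illustrative heuristic only: a few common injection phrasings.
SUSPICIOUS = re.compile(
    r"\b(ignore (all |any )?(previous|prior|above) instructions"
    r"|system prompt|you are now)\b",
    re.IGNORECASE,
)


def looks_like_instruction(value: str) -> bool:
    """Return True if the text matches one of the known phrasings."""
    return SUSPICIOUS.search(value) is not None


def wrap_field(name: str, value: str) -> str:
    """Wrap one extracted field (e.g. NMR, HRMS) in explicit delimiters."""
    # Drop the marker character runs so the payload cannot close the
    # delimited block early or open a fake one.
    cleaned = value.replace("<<<", "").replace(">>>", "")
    flag = ""
    if looks_like_instruction(cleaned):
        flag = " [flagged: possible embedded instruction]"
    return f"{BEGIN}\n{name}{flag}: {cleaned}\n{END}"
```

A field such as `wrap_field("1H NMR", "δ 7.26 (s, 1H)")` is wrapped without a flag, while text containing "Ignore previous instructions" is wrapped and flagged for review.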
Audit Metadata