python-data-pipelines
Pass
Audited by Gen Agent Trust Hub on May 14, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill demonstrates secure engineering practices, providing thorough guidance on secret management (using Vault or cloud providers), data isolation, and defensive coding.
- [INDIRECT_PROMPT_INJECTION]: The skill exposes a significant attack surface because its primary purpose is to ingest untrusted data (PDFs, images, Excel files, and external API responses).
- Ingestion points: External file uploads (PDF, Image, Excel) and third-party API payloads (referenced in SKILL.md and references/pdf-extraction.md).
- Boundary markers: The skill mandates strict validation boundaries using Pydantic models (references/validation-and-deadletter.md).
- Capability inventory: The framework uses subprocess.run for system tools (OCR, scanning), sqlalchemy for database operations, and httpx/stripe for network communication.
- Sanitization: Proactive sanitization is implemented, including EXIF stripping (piexif/Pillow), PDF sanitization (qpdf), and strict MIME-type validation (python-magic).
- [COMMAND_EXECUTION]: System tools like ocrmypdf, pdftoppm, and clamdscan are invoked via subprocess.run in references/pdf-extraction.md. These are implemented using list-based arguments for data processing and security scanning, adhering to safe execution patterns.
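The list-based invocation pattern described in the COMMAND_EXECUTION finding can be illustrated with a minimal sketch. It is not taken from references/pdf-extraction.md: the helper names build_pdftoppm_argv and echo_argument are hypothetical, and the only tool options assumed are pdftoppm's standard -r (resolution) and -png flags. The second helper uses a child Python process in place of a real system tool to show that an argument containing shell metacharacters is delivered as a single literal string when no shell is involved.

```python
import subprocess
import sys
from pathlib import Path


def build_pdftoppm_argv(pdf_path: str, out_prefix: str, dpi: int = 300) -> list[str]:
    """Hypothetical helper: build an argument list for pdftoppm (no shell string)."""
    # Resolving to absolute paths means neither value can begin with "-"
    # and be mistaken for a command-line option.
    pdf = Path(pdf_path).resolve()
    prefix = Path(out_prefix).resolve()
    # Each element is passed to the program as one argument, so spaces,
    # semicolons, or quotes inside a filename are never parsed by a shell.
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)]


def echo_argument(value: str) -> str:
    """Hypothetical demo: a child Python process prints its first argument back."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=30,           # do not hang indefinitely on a stuck child process
    )
    return result.stdout.strip()


if __name__ == "__main__":
    hostile_name = "scan.pdf; echo injected"
    print(build_pdftoppm_argv(hostile_name, "out/page"))
    # The metacharacters come back unchanged because no shell interpreted them.
    print(echo_argument(hostile_name))
```

To run the real tool, the list returned by build_pdftoppm_argv would be passed to subprocess.run with check=True and a timeout, which requires pdftoppm to be installed. Omitting shell=True is what keeps an untrusted filename from being interpreted as a command.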
Audit Metadata