python-data-pipelines

Pass

Audited by Gen Agent Trust Hub on May 14, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill follows secure engineering practices throughout. It gives detailed guidance on secret management (using Vault or cloud providers), data isolation, and defensive coding.
  • [INDIRECT_PROMPT_INJECTION]: The skill exposes a significant attack surface because its primary purpose is to ingest untrusted data (PDFs, images, Excel files, and external API responses).
  • Ingestion points: External file uploads (PDF, Image, Excel) and third-party API payloads (referenced in SKILL.md and references/pdf-extraction.md).
  • Boundary markers: The skill mandates strict validation boundaries using Pydantic models (references/validation-and-deadletter.md).
  • Capability inventory: The skill uses subprocess.run to invoke system tools (OCR, scanning), sqlalchemy for database operations, and httpx/stripe for network communication.
  • Sanitization: Inputs are sanitized before processing, with EXIF stripping (piexif/Pillow), PDF sanitization (qpdf), and strict MIME-type validation (python-magic).
  • [COMMAND_EXECUTION]: System tools such as ocrmypdf, pdftoppm, and clamdscan are invoked via subprocess.run in references/pdf-extraction.md for data processing and security scanning. Arguments are passed as lists rather than as shell strings, which adheres to safe execution patterns.
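As an illustration of the list-based argument pattern noted in the COMMAND_EXECUTION finding, the following is a minimal sketch and not the skill's actual code. The helper names build_ocr_command and echo_argument are hypothetical, and the file names are made up. The sketch assumes only that ocrmypdf takes an input path followed by an output path. It does not run ocrmypdf itself; a child Python process stands in for the external tool, so the example works without OCR software installed.

```python
import subprocess
import sys
from pathlib import Path


def build_ocr_command(src: Path, dst: Path) -> list[str]:
    """Return an argv list for ocrmypdf; each path stays a single argument.

    Hypothetical helper: the audited skill may build its commands differently.
    """
    return ["ocrmypdf", str(src), str(dst)]


def echo_argument(value: str) -> str:
    """Pass `value` to a child Python process as one argv entry.

    Returns the argument exactly as the child process received it.
    """
    completed = subprocess.run(
        # A list of arguments means no shell is involved, so characters
        # such as ";" or "$" are plain data rather than shell syntax.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout/stderr instead of inheriting
        text=True,            # decode output as str
        timeout=30,           # do not wait forever on a stuck child
    )
    return completed.stdout.rstrip("\n")


if __name__ == "__main__":
    # A file name an attacker might upload, made up for this example.
    hostile_name = "report.pdf; echo injected"
    print(build_ocr_command(Path(hostile_name), Path("out.pdf")))
    print(echo_argument(hostile_name))
```

Because the command is a list and shell=True is never set, the hostile file name reaches the child process as one literal argument. The same name interpolated into a single shell string would instead be split at the semicolon and run a second command.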
Audit Metadata
Risk Level: SAFE
Analyzed: May 14, 2026, 11:14 AM