datatalks-data-engineering-zoomcamp
Fail
Audited by Snyk on May 16, 2026
Risk Level: HIGH
Full Analysis
HIGH W007: Insecure credential handling detected in skill instructions.
- Insecure credential handling detected (high risk: 1.00). The prompt embeds hardcoded plaintext credentials and passwords directly in commands and config examples (e.g., POSTGRES_PASSWORD=root, PGADMIN_DEFAULT_PASSWORD=root, --password=root). Following these instructions would require the agent to reproduce secret values verbatim, which is an insecure pattern.
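A minimal sketch of one way to avoid the pattern flagged in W007, assuming the password is supplied through the environment instead of being written into a command or config file. The function name build_pg_url and the variables POSTGRES_USER, POSTGRES_HOST, POSTGRES_PORT, and POSTGRES_DB are illustrative assumptions; only POSTGRES_PASSWORD appears in the audited skill.

```python
import os
from urllib.parse import quote


def build_pg_url(env=None):
    """Build a PostgreSQL connection URL from environment variables.

    Hypothetical helper: the variable names other than POSTGRES_PASSWORD
    are assumptions made for this sketch.
    """
    env = os.environ if env is None else env

    # Refuse to fall back to a default password such as "root".
    password = env.get("POSTGRES_PASSWORD")
    if not password:
        raise RuntimeError("POSTGRES_PASSWORD is not set")

    user = env.get("POSTGRES_USER", "postgres")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    db = env.get("POSTGRES_DB", "postgres")

    # Percent-encode user and password so special characters stay valid in the URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )
```

With this approach the secret never appears in the skill text or in shell history; it still has to be provisioned out of band, for example through a secrets manager or an untracked env file.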
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.80). The SKILL.md explicitly downloads and ingests public, user-controlled data (e.g., wget of https://github.com/DataTalksClub/nyc-tlc-data/... in the Kestra workflow and the ingest_data.py --url parameter, plus public gs:// URIs) so the agent will read and process untrusted third-party content at runtime.
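As a rough illustration of narrowing the exposure described in W011, the sketch below checks a download URL against an allowlist before anything is fetched. The names ALLOWED_HOSTS and is_allowed_source are hypothetical, and the allowlist contents are an assumption. Restricting hosts does not make the downloaded content trustworthy; fetched data should still be treated as data, never as instructions.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; adjust to the sources a deployment actually trusts.
ALLOWED_HOSTS = frozenset({"github.com"})


def is_allowed_source(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose exact hostname is allowlisted."""
    parsed = urlparse(url)
    # Compare the full hostname so lookalikes such as
    # "github.com.evil.example" are rejected.
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts
```

A caller would run this check on the value passed to a parameter such as --url and refuse to download when it returns False.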
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.80). The skill explicitly downloads and installs remote executables used at runtime—e.g., wget https://archive.apache.org/dist/spark/spark-3.3.2/spark-3.3.2-bin-hadoop3.tgz and git clone https://github.com/DataTalksClub/data-engineering-zoomcamp.git—which fetch remote code that will be executed/used as required dependencies.
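One common mitigation for the risk described in W012 is to verify a downloaded archive against a known digest before unpacking or executing it. The sketch below assumes the expected SHA-512 hex digest has been obtained separately from a trusted source; the function names sha512_of_file and verify_download are hypothetical.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file at `path` matches the expected SHA-512 digest.

    `expected_hex` is assumed to come from a trusted, separately obtained
    source; surrounding whitespace and letter case are normalized.
    """
    actual = sha512_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A checksum only proves that the file matches what the publisher listed; pinning a specific commit or release tag for cloned repositories is a complementary step.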
Issues (3)
W007
HIGH: Insecure credential handling detected in skill instructions.
W011
MEDIUM: Third-party content exposure detected (indirect prompt injection risk).
W012
MEDIUM: Unverifiable external dependency detected (runtime URL that controls agent).
Audit Metadata