realtime-cinema-data-engineering-pipeline
Warn
Audited by Gen Agent Trust Hub on May 25, 2026
Risk Level: MEDIUM (EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, CREDENTIALS_UNSAFE)
Full Analysis
- [EXTERNAL_DOWNLOADS]: The installation guide instructs users to clone a repository from an untrusted third-party GitHub account (BaidaneAyoub/realtime-cinema-data-engineering.git).
- [COMMAND_EXECUTION]: The skill instructs users to run several shell commands, including git clone, Python environment setup, pip dependency installation, and execution of the downloaded Python scripts (main_producer.py, main_consumer.py).
- [CREDENTIALS_UNSAFE]: The documentation and code snippets use default credentials (e.g., 'postgres:postgres') for database access. While these serve as placeholders in a tutorial context, they are insecure defaults.
- [PROMPT_INJECTION]: The skill implements an architecture that processes external, untrusted JSON data from a Kafka stream, creating a surface for indirect prompt injection.
  - Ingestion points: Untrusted data enters the pipeline via the KafkaConsumer in 'consumer/main_consumer.py'.
  - Boundary markers: None identified in the Kafka event processing logic.
  - Capability inventory: The skill can perform database writes (PostgreSQL) and orchestrate tasks via Apache Airflow.
  - Sanitization: The code uses parameterized SQL queries (psycopg2), which protect against SQL injection, but the system remains vulnerable to logical indirect injection via the processed JSON payloads.
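The CREDENTIALS_UNSAFE and PROMPT_INJECTION findings above could be mitigated along the lines of the following minimal sketch. It is an illustration under stated assumptions, not code from the audited repository: the helper names (`connection_settings`, `validate_event`), the event fields in `ALLOWED_FIELDS`, and the length limit are all hypothetical, since the audit does not show the actual event schema. The environment variable names follow the standard libpq conventions (`PGHOST`, `PGPASSWORD`, and so on).

```python
import json
import os

# Hypothetical schema: these field names and types are illustrative only;
# the real event layout is not shown in the audit.
ALLOWED_FIELDS = {"movie_id": int, "title": str, "rating": float}
MAX_TEXT_LENGTH = 200  # assumed cap on free-text fields


def connection_settings(env=None):
    """Read database settings from the environment instead of hard-coded defaults.

    Raises RuntimeError when no password is configured, so the code never
    silently falls back to a default such as 'postgres'.
    """
    env = os.environ if env is None else env
    password = env.get("PGPASSWORD")
    if not password:
        raise RuntimeError("PGPASSWORD is not set; refusing to use a default password")
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        "dbname": env.get("PGDATABASE", "postgres"),
        "user": env.get("PGUSER", "postgres"),
        "password": password,
    }


def validate_event(raw):
    """Parse one message value and return only the allowed, type-checked fields.

    Raises ValueError on malformed JSON, unexpected or missing fields,
    wrong types, or over-long text.
    """
    try:
        event = json.loads(raw)
    except (ValueError, TypeError) as exc:
        raise ValueError(f"malformed JSON payload: {exc}") from exc
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")

    unexpected = set(event) - set(ALLOWED_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")

    clean = {}
    for name, expected_type in ALLOWED_FIELDS.items():
        if name not in event:
            raise ValueError(f"missing field: {name}")
        value = event[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            raise ValueError(f"field {name} has the wrong type")
        # Accept JSON integers where a float is expected.
        if expected_type is float and isinstance(value, int):
            value = float(value)
        if not isinstance(value, expected_type):
            raise ValueError(f"field {name} has the wrong type")
        if isinstance(value, str) and len(value) > MAX_TEXT_LENGTH:
            raise ValueError(f"field {name} exceeds {MAX_TEXT_LENGTH} characters")
        clean[name] = value
    return clean
```

In a consumer loop, each message value would pass through `validate_event` before any database write, and rejected messages would be logged or routed elsewhere rather than processed. The returned settings dictionary can be passed as keyword arguments to `psycopg2.connect`. An allow-list reduces the injection surface but does not make free-text fields such as `title` trustworthy, so downstream consumers should still treat them as data, never as instructions.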
Audit Metadata