Enterprise Data Engineering Pipeline (SSIS + PySpark)
Skill by ara.so — Data Skills collection.
Overview
This project provides a complete enterprise data engineering solution that combines:
- SSIS (SQL Server Integration Services) for ETL orchestration
- SQL Server with Star Schema data warehouse design (fact and dimension tables)
- Python (Pandas) for data quality audits and visualization
- PySpark for big data analytics and aggregation
The pipeline ingests raw CSV files (Sales, Products, Customers), transforms them with SSIS, loads them into the star-schema data warehouse in SQL Server, and then runs analytics at scale with PySpark.
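
To make the flow above concrete, here is a minimal sketch of the same shape (CSV in, dimension and fact tables, one aggregate query) using only the Python standard library. It is an illustration under stated assumptions, not the project's implementation: SQLite stands in for SQL Server, plain Python stands in for the SSIS transformations, and every table name, column name, and sample row (`dim_product`, `fact_sales`, `amount`, and so on) is hypothetical, since the real CSV layouts are not described here.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the project's real CSV columns are not documented here.
PRODUCTS_CSV = "product_id,product_name\n1,Widget\n2,Gadget\n"
CUSTOMERS_CSV = "customer_id,customer_name\n10,Acme\n11,Globex\n"
SALES_CSV = (
    "sale_id,product_id,customer_id,quantity,amount\n"
    "100,1,10,2,20.0\n"
    "101,2,10,1,15.0\n"
    "102,1,11,3,30.0\n"
)


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def build_star_schema(conn):
    """Create two dimension tables and one fact table that references them."""
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        CREATE TABLE dim_customer (
            customer_id   INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            product_id  INTEGER NOT NULL REFERENCES dim_product(product_id),
            customer_id INTEGER NOT NULL REFERENCES dim_customer(customer_id),
            quantity    INTEGER NOT NULL,
            amount      REAL NOT NULL
        );
        """
    )


def load_csvs(conn):
    """Load dimensions first, then facts, converting text fields to typed values."""
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(int(r["product_id"]), r["product_name"]) for r in read_rows(PRODUCTS_CSV)],
    )
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?)",
        [(int(r["customer_id"]), r["customer_name"]) for r in read_rows(CUSTOMERS_CSV)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
        [
            (
                int(r["sale_id"]),
                int(r["product_id"]),
                int(r["customer_id"]),
                int(r["quantity"]),
                float(r["amount"]),
            )
            for r in read_rows(SALES_CSV)
        ],
    )
    conn.commit()


def revenue_by_product(conn):
    """Join the fact table to a dimension and total the sales amount per product."""
    cursor = conn.execute(
        """
        SELECT p.product_name, SUM(f.amount) AS revenue
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.product_name
        ORDER BY p.product_name
        """
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_csvs(connection)
    for name, revenue in revenue_by_product(connection):
        print(name, revenue)
```

Loading the dimension tables before the fact table mirrors the usual ordering in a star-schema load, so that every fact row can reference keys that already exist. The aggregate query at the end is the kind of work the pipeline hands to PySpark once data volumes grow.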