enterprise-data-engineering-pipeline-ssis-pyspark

Enterprise Data Engineering Pipeline (SSIS + PySpark)

Skill by ara.so — Data Skills collection.

Overview

This project provides a complete enterprise data engineering solution that combines:

  • SSIS (SQL Server Integration Services) for ETL orchestration
  • SQL Server with Star Schema data warehouse design (fact and dimension tables)
  • Python (Pandas) for data quality audits and visualization
  • PySpark for big data analytics and aggregation

The pipeline ingests raw CSV files (Sales, Products, Customers), transforms them in SSIS, loads them into a dimensional model (fact and dimension tables), and then runs analytics at scale with PySpark.
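
The flow described above (CSV in, star schema, aggregate out) can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, plain `csv` parsing as a stand-in for the SSIS load step, and every table name, column name, and sample row (`dim_product`, `fact_sales`, `PRODUCTS_CSV`, `SALES_CSV`) is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw extracts; the real files and their columns are assumptions.
PRODUCTS_CSV = """product_id,product_name,category
1,Widget,Hardware
2,Gadget,Hardware
3,Manual,Books
"""

SALES_CSV = """sale_id,product_id,quantity,unit_price
100,1,2,10.00
101,2,1,25.00
102,3,4,5.00
103,1,1,10.00
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_warehouse(conn):
    """Create one dimension table and one fact table, then load both."""
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category     TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
            quantity    INTEGER NOT NULL,
            unit_price  REAL NOT NULL
        );
        """
    )
    # Cast explicitly: csv yields strings, the warehouse expects typed values.
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(int(r["product_id"]), r["product_name"], r["category"])
         for r in read_rows(PRODUCTS_CSV)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(int(r["sale_id"]), int(r["product_id"]),
          int(r["quantity"]), float(r["unit_price"]))
         for r in read_rows(SALES_CSV)],
    )


def revenue_by_category(conn):
    """Join the fact table to its dimension and aggregate revenue."""
    rows = conn.execute(
        """
        SELECT d.category, SUM(f.quantity * f.unit_price) AS revenue
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_key = f.product_key
        GROUP BY d.category
        ORDER BY d.category
        """
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
result = revenue_by_category(conn)
print(result)
```

The fact table holds only keys and measures, while descriptive attributes such as `category` live in the dimension table; the aggregate is therefore a join plus a `GROUP BY`, which is the same shape of query a PySpark job would run against the warehouse at larger scale.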

Architecture Components

First seen: May 23, 2026
enterprise-data-engineering-pipeline-ssis-pyspark — aradotso/data-skills