building-data-pipelines

Building Data Pipelines

Build robust, efficient batch data pipelines in Python. This skill covers the complete pipeline lifecycle: extracting data from sources, transforming it with DataFrames or SQL, loading it to destinations, and operating the pipeline to production standards.
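
As a rough illustration of the extract, transform, and load stages named above, here is a minimal sketch that uses only the Python standard library. The sample data, the `orders` table, and the function names `extract`, `transform`, and `load` are all hypothetical and are not taken from the skill itself; a real pipeline would more likely use Polars, DuckDB, or PyArrow for the transform step.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from a file, API, or database.
RAW_CSV = """order_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,eur
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, cast types, normalize currency codes."""
    return [
        (int(r["order_id"]), float(r["amount"]), r["currency"].upper())
        for r in rows
        if r["amount"]
    ]


def load(conn, records):
    """Load: write the cleaned records to a destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    # INSERT OR REPLACE keeps the load idempotent when the same batch is re-run.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, amount, currency FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each stage as a separate function makes it easier to test the stages in isolation and to swap the in-memory SQLite destination for a real one.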

When to use this skill

Use this skill when:

  • Building ETL/ELT pipelines in Python
  • Choosing between Polars, DuckDB, PyArrow, or SQL for data processing
  • Designing data layer architecture (Bronze/Silver/Gold)
  • Implementing incremental loading with watermarks or CDC
  • Deciding on append vs overwrite vs merge semantics
  • Setting up partitioning and file sizing strategies
  • Validating data quality at pipeline boundaries
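
To make two of the items above more concrete, the following sketch combines watermark-based incremental loading with merge (upsert) semantics. It is one possible approach under simplifying assumptions, not the skill's prescribed method: the target is a plain dict keyed by `id`, the watermark is the largest `updated_at` seen so far, and the function name `incremental_merge` and the sample rows are hypothetical.

```python
from datetime import datetime


def incremental_merge(source_rows, target, watermark):
    """Merge rows newer than `watermark` into `target`, keyed by "id".

    Returns the new watermark: the largest `updated_at` processed so far.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] <= watermark:
            continue  # already processed in an earlier run
        target[row["id"]] = row  # merge: insert new keys, overwrite existing ones
        new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical source rows; the third row is a later update to id 1.
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "new"},
    {"id": 2, "updated_at": datetime(2024, 1, 2), "status": "new"},
    {"id": 1, "updated_at": datetime(2024, 1, 3), "status": "shipped"},
]

target = {}
# First run sees only the first two rows and starts from the earliest possible watermark.
wm_first = incremental_merge(source[:2], target, datetime.min)
# Second run sees the full source but only processes rows past the stored watermark.
wm_second = incremental_merge(source, target, wm_first)
print(wm_second, target[1]["status"])
```

By contrast, append semantics would keep both versions of `id` 1, and overwrite semantics would replace the whole target on every run. Note that the strict `<=` comparison skips rows whose timestamp equals the watermark, so a production pipeline would need to decide how to handle late-arriving rows that share a timestamp.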

When not to use this skill

First Seen
Apr 11, 2026
building-data-pipelines — legout/data-platform-agent-skills