data-engineering

Data Engineering Hub

This hub indexes the data engineering skill suite and organizes it into logical, non-overlapping domains.

Skill Map

| Domain | Skill | When to Use |
| --- | --- | --- |
| Core | @data-engineering-core | Polars, DuckDB, PyArrow fundamentals; ETL patterns; error handling; performance optimization |
| Storage | @data-engineering-storage-lakehouse | Delta Lake, Apache Iceberg, Apache Hudi |
| | @data-engineering-storage-remote-access | fsspec, pyarrow.fs, obstore; cloud access patterns |
| | @data-engineering-storage-authentication | AWS, GCP, Azure auth: IAM roles, managed identity, secrets management |
| | @data-engineering-storage-formats | Parquet optimizations, Lance, Zarr, Avro, ORC |
| Orchestration | @data-engineering-orchestration | Prefect, Dagster, dbt, workflow scheduling |
| Streaming | @data-engineering-streaming | Kafka, MQTT, NATS JetStream for real-time data |
| Quality | @data-engineering-quality | Great Expectations, Pandera for data validation |
| Observability | @data-engineering-observability | OpenTelemetry, Prometheus for pipeline monitoring |
| AI/ML | @data-engineering-ai-ml | Embeddings, vector databases, RAG pipelines |
| Best Practices | @data-engineering-best-practices | Medallion architecture, partitioning, file sizing, incremental loads, schema evolution, testing |
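
The "When to Use" column can be read as a topic-to-skill lookup. The following is a minimal sketch of that idea in Python. The names `SKILL_MAP` and `skills_for` are hypothetical and not part of the skill suite, and the keyword lists are abbreviated from the Skill Map rather than exhaustive.

```python
# Hypothetical lookup built from the Skill Map. The skill names come from the
# map itself; the dictionary layout and helper function are illustrative only.
SKILL_MAP = {
    "@data-engineering-core": ["polars", "duckdb", "pyarrow", "etl"],
    "@data-engineering-storage-lakehouse": ["delta lake", "iceberg", "hudi"],
    "@data-engineering-storage-remote-access": ["fsspec", "pyarrow.fs", "obstore"],
    "@data-engineering-storage-authentication": ["iam roles", "managed identity"],
    "@data-engineering-storage-formats": ["parquet", "lance", "zarr", "avro", "orc"],
    "@data-engineering-orchestration": ["prefect", "dagster", "dbt"],
    "@data-engineering-streaming": ["kafka", "mqtt", "nats jetstream"],
    "@data-engineering-quality": ["great expectations", "pandera"],
    "@data-engineering-observability": ["opentelemetry", "prometheus"],
    "@data-engineering-ai-ml": ["embeddings", "vector databases", "rag"],
    "@data-engineering-best-practices": ["medallion architecture", "partitioning"],
}


def skills_for(topic: str) -> list[str]:
    """Return the skills whose keyword list contains the topic (exact, case-insensitive)."""
    needle = topic.strip().lower()
    return [skill for skill, topics in SKILL_MAP.items() if needle in topics]


if __name__ == "__main__":
    for topic in ("DuckDB", "Kafka", "Parquet"):
        print(topic, "->", skills_for(topic))
```

Exact matching is used instead of substring matching so that a query such as "pyarrow" resolves only to the core skill and not also to the "pyarrow.fs" entry under remote access.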

More from legout/data-platform-agent-skills

Installs: 6
First Seen: Feb 11, 2026