Data Lake Platform

Build and operate production data lakes and lakehouses: ingest, transform, store in open formats, and serve analytics reliably.

When to Use

  • Design data lake/lakehouse architecture
  • Set up ingestion pipelines (batch, incremental, CDC)
  • Build SQL transformation layers (SQLMesh, dbt)
  • Choose table formats and catalogs (Iceberg, Delta, Hudi)
  • Deploy query/serving engines (Trino, ClickHouse, DuckDB)
  • Implement streaming pipelines (Kafka, Flink)
  • Set up orchestration (Dagster, Airflow, Prefect)
  • Add governance, lineage, data quality, and cost controls
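
As a toy illustration of the incremental-ingestion pattern mentioned above, the sketch below tracks a high watermark so each run extracts only rows that changed since the last run. This is a minimal, hypothetical example in plain Python — the function and field names (`incremental_extract`, `updated_at`) are assumptions for illustration, not any specific tool's API:

```python
from datetime import datetime, timezone

def incremental_extract(rows, last_watermark):
    """Select only rows newer than the stored high watermark,
    returning the new batch and the advanced watermark."""
    batch = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in batch), default=last_watermark)
    return batch, new_watermark

# Hypothetical source table with update timestamps.
source = [
    {"id": 1, "updated_at": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2026, 1, 2, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2026, 1, 3, tzinfo=timezone.utc)},
]

# First run: everything is newer than the initial watermark.
batch, wm = incremental_extract(source, datetime.min.replace(tzinfo=timezone.utc))
assert [r["id"] for r in batch] == [1, 2, 3]

# Second run: nothing has changed, so nothing is re-extracted.
batch, wm = incremental_extract(source, wm)
assert batch == []
```

In a real pipeline the watermark is persisted (e.g. in orchestrator state or a control table) and the extract runs as an incremental query against the source.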

Triage Questions

  1. Batch, streaming, or hybrid? What is the freshness SLO?
  2. Append-only vs upserts/deletes (CDC)? Is time travel required?
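
To make question 2 concrete: an append-only lake just accumulates files, but CDC streams carry inserts, updates, and deletes that must be merged by key — which is the capability table formats like Iceberg, Delta, and Hudi add on top of object storage. A toy in-memory sketch of that merge semantics (plain Python, event shape assumed for illustration):

```python
def apply_cdc(table, events):
    """Apply insert/update/delete change events to a keyed table,
    giving upsert semantics rather than append-only accumulation."""
    for ev in events:
        if ev["op"] in ("insert", "update"):
            table[ev["key"]] = ev["value"]  # upsert by primary key
        elif ev["op"] == "delete":
            table.pop(ev["key"], None)      # tombstone removes the row
    return table

state = {}
events = [
    {"op": "insert", "key": 1, "value": "a"},
    {"op": "insert", "key": 2, "value": "b"},
    {"op": "update", "key": 1, "value": "a2"},
    {"op": "delete", "key": 2, "value": None},
]
apply_cdc(state, events)
assert state == {1: "a2"}
```

If the answer to question 2 is "append-only", plain partitioned Parquet may suffice; upserts, deletes, or time travel push the design toward one of the transactional table formats.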
Installs: 96 · GitHub Stars: 60 · First Seen: Jan 23, 2026