databricks-iceberg

Apache Iceberg on Databricks

Databricks provides multiple ways to work with Apache Iceberg: native managed Iceberg tables, UniForm for Delta-to-Iceberg interoperability, and the Iceberg REST Catalog (IRC) for external engine access.


Critical Rules (always follow)

  • MUST use Unity Catalog — all Iceberg features require UC-enabled workspaces
  • MUST NOT install an Iceberg library into Databricks Runtime (DBR includes built-in Iceberg support; adding a library causes version conflicts)
  • MUST NOT set write.metadata.path or write.metadata.previous-versions-max — Databricks manages metadata locations automatically; overriding causes corruption
  • MUST determine which Iceberg pattern fits the use case before writing code — see the When to Use section below
  • MUST know that PARTITIONED BY and CLUSTER BY produce the same Iceberg metadata for external engines. UC maintains an Iceberg partition spec whose partition fields correspond to the clustering keys, so external engines reading via IRC see a partitioned Iceberg table (proper Iceberg partition fields, not Hive-style) and can prune on those fields; internally, UC uses the same fields as liquid clustering keys. The two syntaxes differ in only two ways: (1) PARTITIONED BY is standard Iceberg DDL that any engine can use to create the table, while CLUSTER BY is DBR-only DDL; (2) PARTITIONED BY handles the DV/row-tracking properties automatically, while CLUSTER BY requires setting them manually in TBLPROPERTIES on v2
  • MUST NOT use expression-based partition transforms (bucket(), years(), months(), days(), hours()) with PARTITIONED BY on managed Iceberg tables — only plain column references are supported; expression transforms cause errors
  • MUST disable deletion vectors and row tracking when using CLUSTER BY on Iceberg v2 tables — set 'delta.enableDeletionVectors' = false and 'delta.enableRowTracking' = false in TBLPROPERTIES (Iceberg v3 handles this automatically; PARTITIONED BY handles this automatically on both v2 and v3)
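
The DDL-related rules above can be sketched as a small statement builder. This is an illustrative sketch, not a Databricks API: the helper build_iceberg_ddl, its parameters, and the table name main.sales.orders are hypothetical, and the CREATE TABLE ... USING ICEBERG shape is assumed here for managed Iceberg tables. The helper only builds a SQL string; it rejects expression-based partition transforms and the metadata properties that Databricks manages, and adds the two TBLPROPERTIES when CLUSTER BY is used on a v2 table.

```python
import re

# Properties the rules above say must never be set by the user.
FORBIDDEN_PROPERTIES = {"write.metadata.path", "write.metadata.previous-versions-max"}

# A plain column reference: a bare identifier, not a call such as days(ts).
_PLAIN_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_iceberg_ddl(table, columns, keys, layout="PARTITIONED BY",
                      format_version=2, properties=None):
    """Build a CREATE TABLE statement string (hypothetical helper).

    columns:    list of (name, sql_type) pairs
    keys:       partition or clustering columns (plain column names only)
    properties: dict of property name -> already-rendered SQL literal
    """
    if layout not in ("PARTITIONED BY", "CLUSTER BY"):
        raise ValueError(f"unknown layout: {layout}")
    for key in keys:
        if not _PLAIN_COLUMN.match(key):
            raise ValueError(f"only plain column references are supported, got: {key}")

    props = dict(properties or {})
    bad = FORBIDDEN_PROPERTIES.intersection(props)
    if bad:
        raise ValueError(f"Databricks manages these properties: {sorted(bad)}")

    # CLUSTER BY on Iceberg v2 needs deletion vectors and row tracking disabled.
    if layout == "CLUSTER BY" and format_version == 2:
        props["delta.enableDeletionVectors"] = "false"
        props["delta.enableRowTracking"] = "false"

    column_sql = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    ddl = f"CREATE TABLE {table} ({column_sql}) USING ICEBERG {layout} ({', '.join(keys)})"
    if props:
        prop_sql = ", ".join(f"'{k}' = {v}" for k, v in sorted(props.items()))
        ddl += f" TBLPROPERTIES ({prop_sql})"
    return ddl


# Hypothetical table and columns, for illustration only.
COLUMNS = [("id", "BIGINT"), ("region", "STRING")]
print(build_iceberg_ddl("main.sales.orders", COLUMNS, ["region"]))
print(build_iceberg_ddl("main.sales.orders", COLUMNS, ["region"], layout="CLUSTER BY"))
```

The first statement uses PARTITIONED BY and carries no extra properties; the second uses CLUSTER BY on the default v2 and therefore carries both TBLPROPERTIES. Passing a key such as days(ts), or a properties dict containing write.metadata.path, raises ValueError before any SQL is produced.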

Key Concepts

First Seen
Feb 27, 2026
databricks-iceberg — databricks-solutions/ai-dev-kit