dqx-patterns


DQX Data Quality Framework Patterns

Overview

DQX is a Python-based data quality framework from Databricks Labs that validates PySpark DataFrames with richer diagnostics than standard DLT expectations. This skill provides production-grade patterns for integrating DQX into medallion architecture pipelines.

Recommended Version: >=0.12.0 (float support, outlier detection, JSON validation, AI-assisted rules)

Key Benefits:

  • Detailed per-row diagnostics (_errors and _warnings result columns)
  • Flexible quarantine strategies (drop, mark, split)
  • Dataset-level checks (uniqueness, foreign keys, outliers, aggregations)
  • YAML/JSON/Delta/Lakebase check storage with governance
  • Auto-profiling and AI-assisted rule generation (0.10.0+)
  • Summary metrics for quality tracking over time (0.10.0+)
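To make the check-storage benefit concrete, here is a minimal sketch of checks written as plain metadata and round-tripped through JSON. It assumes the declarative shape (criticality, check.function, check.arguments) used by recent DQX releases; the column names order_id and status, the check names, and the has_expected_shape helper are hypothetical and exist only for illustration.

```python
import json

# Checks expressed as plain metadata (assumed DQX declarative shape).
# Column names and check names are hypothetical.
checks = [
    {
        "name": "order_id_is_not_null",
        "criticality": "error",  # failing rows are reported as errors
        "check": {
            "function": "is_not_null",
            "arguments": {"column": "order_id"},
        },
    },
    {
        "name": "status_in_allowed_values",
        "criticality": "warn",  # failing rows are reported as warnings
        "check": {
            "function": "is_in_list",
            "arguments": {
                "column": "status",
                "allowed": ["open", "shipped", "closed"],
            },
        },
    },
]


def has_expected_shape(check):
    """Local sanity check only; a stand-in, not DQX's own validation."""
    return (
        check.get("criticality") in {"error", "warn"}
        and isinstance(check.get("check"), dict)
        and "function" in check["check"]
        and isinstance(check["check"].get("arguments", {}), dict)
    )


# Serialize for storage in a file or table, then load it back.
serialized = json.dumps(checks, indent=2)
restored = json.loads(serialized)
all_well_formed = all(has_expected_shape(c) for c in restored)
```

Because the checks are plain data, they can be reviewed and versioned like any other configuration before being handed to the engine.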

Quick Start (3-4 hours for pilot)

Goal: Add DQX diagnostics to one Silver table without disrupting existing DLT expectations.
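One way to meet this goal is to mark rows rather than drop them, so the Silver table keeps every record and the existing DLT expectations stay untouched. The sketch below assumes DQEngine.apply_checks_by_metadata from databricks.labs.dqx.engine and a WorkspaceClient from databricks.sdk; the function name add_dqx_diagnostics, the PILOT_CHECKS constant, and the customer_id column are hypothetical. The engine is injectable so the wiring can be exercised outside Databricks.

```python
# Hypothetical pilot checks for a single Silver table.
PILOT_CHECKS = [
    {
        "criticality": "error",
        "check": {
            "function": "is_not_null",
            "arguments": {"column": "customer_id"},  # hypothetical column
        },
    },
]


def add_dqx_diagnostics(silver_df, checks, engine=None):
    """Return silver_df with DQX result columns appended (mark strategy).

    No rows are removed here; quarantine decisions are left to later steps.
    """
    if engine is None:
        # Imported lazily so this module also loads outside Databricks.
        from databricks.labs.dqx.engine import DQEngine
        from databricks.sdk import WorkspaceClient

        engine = DQEngine(WorkspaceClient())
    # Assumption: this call adds the result columns and keeps all rows.
    return engine.apply_checks_by_metadata(silver_df, checks)
```

In a pipeline this would be called once on the Silver DataFrame, for example add_dqx_diagnostics(silver_df, PILOT_CHECKS), and the result written alongside or in place of the existing table.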

First Seen
Mar 8, 2026
dqx-patterns — databricks-solutions/vibe-coding-workshop-template