Purpose

dbt (data build tool) is a command-line tool for transforming data that is already loaded in a warehouse, using SQL-based models. It lets developers write, test, and document transformations in one project. It covers the transform step of an ELT pipeline and does not extract or load data itself.

When to Use

Use dbt when building SQL data models for warehouses like Snowflake, BigQuery, or Redshift, especially when you need version-controlled SQL code with built-in validation. It fits incremental model builds, schema evolution, and automated testing in data pipelines. Avoid it for real-time processing or for data sources that cannot be queried with SQL.

Key Capabilities

Define reusable SQL models in .sql files with Jinja templating for dynamic queries (e.g., {{ var('date') }} for parameter injection).
Run automated tests: generic tests declared in YAML configs (e.g., not_null or unique) and custom assertions written as SQL queries.
Generate documentation automatically from models using dbt docs generate, outputting HTML with model dependencies and descriptions.
Handle incremental models with the is_incremental() macro to process only new data, reducing warehouse load.
Extend functionality with your own macros and with packages from the dbt package hub, installed with dbt deps (e.g., packages of utility macros).
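
To make the incremental and testing capabilities above concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in warehouse. It is not dbt itself: it runs, by hand, SQL of the shape an incremental model typically uses once its is_incremental() branch is active, followed by queries equivalent to the not_null and unique tests. The table names raw_events and fct_events and the updated_at column are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a source table and a target table.

    raw_events (hypothetical) plays the role of the source; fct_events
    (hypothetical) is the table an earlier run of the model already built.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE raw_events (id INTEGER, updated_at TEXT);
        CREATE TABLE fct_events (id INTEGER, updated_at TEXT);
        INSERT INTO raw_events VALUES
            (1, '2024-01-01'), (2, '2024-01-02'), (3, '2024-01-03');
        INSERT INTO fct_events VALUES
            (1, '2024-01-01'), (2, '2024-01-02');
        """
    )
    return conn


# Only source rows newer than the newest row already in the target are loaded.
# On an empty target MAX() is NULL and nothing would match, which is why a
# dbt model guards this filter with is_incremental() and loads everything
# on the first run.
INCREMENTAL_SQL = """
    INSERT INTO fct_events
    SELECT id, updated_at
    FROM raw_events
    WHERE updated_at > (SELECT MAX(updated_at) FROM fct_events)
"""

# A dbt test is a query that selects failing rows; zero rows means it passes.
NOT_NULL_SQL = "SELECT id FROM fct_events WHERE id IS NULL"
UNIQUE_SQL = """
    SELECT id
    FROM fct_events
    WHERE id IS NOT NULL
    GROUP BY id
    HAVING COUNT(*) > 1
"""


def count_rows(conn):
    """Return the number of rows currently in the target table."""
    return conn.execute("SELECT COUNT(*) FROM fct_events").fetchone()[0]


def run_incremental(conn):
    """Apply the incremental load and return how many rows it added."""
    before = count_rows(conn)
    conn.execute(INCREMENTAL_SQL)
    conn.commit()
    return count_rows(conn) - before


def failing_rows(conn, test_sql):
    """Return the number of rows a test query flags as failures."""
    return len(conn.execute(test_sql).fetchall())


if __name__ == "__main__":
    db = make_demo_db()
    print("rows added on first run:", run_incremental(db))
    print("rows added on second run:", run_incremental(db))
    print("not_null failures:", failing_rows(db, NOT_NULL_SQL))
    print("unique failures:", failing_rows(db, UNIQUE_SQL))
```

In a real project the filter would live in a model's .sql file inside a Jinja is_incremental() block, and the two tests would be declared on the column in a YAML file instead of being written out as queries.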

Usage Patterns

Follow this workflow:
1) Initialize a project with dbt init.
2) Write models as SQL files in the models/ directory.
3) Configure warehouse connections in profiles.yml.
4) Run and test models iteratively with dbt run and dbt test.
5) Use seeds for static data and snapshots for slowly changing dimensions.

For CI/CD, call dbt from scripts, for example by executing dbt run inside a Docker container with the project mounted as a volume. Pass the target explicitly (e.g., --target dev) to select the environment to run against.
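
The scripted CI/CD usage described above could look something like the following Python sketch. It only assembles the command lines, so it can be checked without dbt installed. The helper names dbt_command and ci_steps and the model name orders are hypothetical; --target and --select are existing dbt CLI flags.

```python
import shlex


def dbt_command(subcommand, target, select=None):
    """Build the argument list for one dbt invocation.

    subcommand: a dbt command such as "run" or "test".
    target: the environment name passed to --target.
    select: optional model selector passed to --select.
    """
    args = ["dbt", subcommand, "--target", target]
    if select:
        args += ["--select", select]
    return args


def ci_steps(target):
    """Return the commands for a simple pipeline: build models, then test."""
    return [dbt_command("run", target), dbt_command("test", target)]


if __name__ == "__main__":
    # Print the commands; a CI script would hand each list to
    # subprocess.run(args, check=True) so a failing step stops the job.
    for step in ci_steps("dev"):
        print(shlex.join(step))
    # "orders" is a hypothetical model name used only for illustration.
    print(shlex.join(dbt_command("run", "dev", select="orders")))
```

Keeping the target a required argument means a script cannot fall back to whichever default environment is configured, which matches the advice to always state the target explicitly.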