SQL Transformation

dbt

When clients need open-source portability or are already on a dbt workflow, we bring the same rigor — tested models, version control, and documentation as code.

What It Is

dbt (data build tool) is the open-source standard for SQL-based data transformation. It compiles templated SQL into warehouse queries, resolves dependencies between models, runs tests, and generates documentation automatically. It works with BigQuery, Snowflake, Redshift, and other warehouses. Recent releases add dbt Copilot (AI-assisted SQL, generally available), the Fusion engine (Rust-based, much faster than dbt Core, still in beta), and a generally available Semantic Layer powered by MetricFlow.
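To make "compiles templated SQL" and "manages dependencies" concrete, here is a minimal Python sketch of the idea. It is a toy illustration, not dbt's implementation: the model names, the `analytics` schema, and the helper functions are all hypothetical, and a regex stands in for real Jinja rendering.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> templated SQL (invented for illustration).
MODELS = {
    "stg_orders": "select id, customer_id, amount from raw.orders",
    "stg_customers": "select id, name from raw.customers",
    "customer_revenue": (
        "select c.name, sum(o.amount) as revenue "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on c.id = o.customer_id "
        "group by c.name"
    ),
}

# Matches {{ ref('model_name') }}; a stand-in for real template parsing.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names a model depends on."""
    return set(REF_PATTERN.findall(sql))


def compile_model(sql, schema="analytics"):
    """Replace each ref() call with a schema-qualified table name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


def build_order(models):
    """Order models so every dependency is built before its dependents."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(MODELS)
compiled = {name: compile_model(MODELS[name]) for name in order}
for name in order:
    print(f"-- {name}\n{compiled[name]}\n")
```

The point of the sketch is that the `ref()` calls do double duty: they tell the tool which tables to substitute at compile time and which models must run first.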

Why We Chose It

dbt is the most widely adopted transformation tool in modern data engineering. When clients have existing dbt projects, multi-cloud requirements, or teams with established dbt expertise, we work within that ecosystem rather than pushing a migration. dbt Cloud also offers a fully managed workflow that mirrors what Dataform provides natively on GCP.

One caveat: Fivetran and dbt Labs announced a merger in October 2025. That consolidation is convenient for clients on the Fivetran + dbt path, but it raises vendor lock-in risk. Pairing Airbyte with Dataform (or dbt Core) keeps the ingestion and transformation layers in separate hands.

How We Use It

  • Build and restructure dbt projects following the staging → intermediate → marts layer convention
  • Write Jinja-templated SQL macros for reusable transformation logic
  • Implement dbt tests (not null, unique, referential integrity, custom) across critical models
  • Configure dbt documentation and lineage graphs so new team members understand data flow
  • Set up dbt Cloud jobs with CI/CD integration: run tests on PRs before merging
  • Advise on Dataform vs. dbt tradeoffs for greenfield GCP projects
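
The test types listed above (not null, unique, referential integrity) all follow the same pattern: a query that selects the rows violating a rule, where zero rows means the test passes. The Python sketch below shows that pattern against an in-memory SQLite database. The tables, rows, and check names are invented for illustration, and the SQL is a simplified approximation rather than what dbt generates.

```python
import sqlite3

# Hypothetical sample data with deliberate problems in the orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (id integer, name text);
    create table orders (id integer, customer_id integer);
    insert into customers values (1, 'Ada'), (2, 'Grace');
    insert into orders values (10, 1), (11, 2), (11, 2), (12, 99), (null, 1);
""")

# Each check selects the rows that break the rule.
CHECKS = {
    "not_null_orders_id": "select * from orders where id is null",
    "unique_orders_id": """
        select id from orders
        where id is not null
        group by id
        having count(*) > 1
    """,
    "relationships_orders_customer_id": """
        select o.customer_id
        from orders o
        left join customers c on c.id = o.customer_id
        where o.customer_id is not null and c.id is null
    """,
}


def run_checks(connection, checks):
    """Run every check and return the number of failing rows per check."""
    return {
        name: len(connection.execute(sql).fetchall())
        for name, sql in checks.items()
    }


results = run_checks(conn, CHECKS)
for name, failures in results.items():
    status = "PASS" if failures == 0 else "FAIL"
    print(f"{status} {name}: {failures} failing row(s)")
```

In a real project these rules would be declared in the project's YAML files rather than written by hand. The sketch only shows what each rule is checking.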

dbt or Dataform?

Choose dbt when:

  • You have existing dbt investment (projects, CI/CD, team knowledge)
  • You need multi-cloud portability (BigQuery + Snowflake, or moving later)
  • Your team already works in dbt every day

Choose dbt Cloud when:

  • You want managed CI/CD, documentation hosting, and the Semantic Layer without building them yourself
  • You're comfortable with the price (starts around $100/developer/month)

Choose dbt Core when:

  • Your budget rules out dbt Cloud
  • You're willing to run orchestration yourself (usually Dagster or GitHub Actions)

Choose Dataform instead when:

  • You're on GCP only, with no multi-cloud plans
  • Your team is new to analytics engineering
  • You want to skip the orchestration layer entirely