SQL Transformation

Dataform

Dataform is our first choice for SQL transformation on GCP — it turns raw BigQuery data into production-grade, version-controlled analytics models.

What It Is

Dataform is Google's managed SQL transformation tool, natively integrated into the GCP console. It lets you write modular SQL workflows (SQLX files), define dependencies between tables, run tests on your data, and deploy changes through a Git-based workflow — all without leaving BigQuery. Gemini assistance is built in for SQLX generation and error fixing.
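As a rough illustration of what a SQLX file looks like, the sketch below defines a staging view and declares its upstream dependency with ref(). The table and column names (raw_orders, order_total, and so on) and the file path in the comment are hypothetical, and the example assumes raw_orders is declared elsewhere in the project.

```sql
config {
  type: "view",
  description: "Cleaned orders, one row per order"
}

-- Hypothetical file: definitions/staging/stg_orders.sqlx
-- ref() resolves to the fully qualified table name and records the dependency,
-- so this view is built only after raw_orders is available.
SELECT
  order_id,
  customer_id,
  CAST(order_total AS NUMERIC) AS order_total,
  created_at
FROM ${ref("raw_orders")}
```

Downstream models reference this view with ref("stg_orders"), which is how Dataform derives the execution order without a hand-maintained task list.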

Why We Chose It

For GCP-native teams, Dataform is the lowest-friction path to well-structured SQL. You don't need an external orchestrator: Dataform handles scheduling, dependency ordering, and retries. Because it runs inside GCP, access goes through IAM rather than separately stored warehouse credentials, and query costs land on your existing BigQuery bill.

How We Use It

Migrate spaghetti SQL scripts into structured Dataform projects with clear staging, intermediate, and output layers

Implement incremental models that process only new or changed rows on each run, so scheduled runs scan less data and cost less in BigQuery
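A minimal sketch of an incremental model, assuming a hypothetical stg_events table with event_id and event_timestamp columns. On the first run the table is built in full; on later runs incremental() is true and only rows newer than the current maximum are read.

```sql
config {
  type: "incremental",
  uniqueKey: ["event_id"],
  bigquery: {
    partitionBy: "DATE(event_timestamp)"
  }
}

SELECT
  event_id,
  user_id,
  event_timestamp
FROM ${ref("stg_events")}
-- Applied only on incremental runs; self() refers to this model's existing table.
${when(incremental(), `WHERE event_timestamp > (SELECT MAX(event_timestamp) FROM ${self()})`)}
```

With uniqueKey set, rows that match an existing event_id are updated instead of duplicated. A strict "newer than max" filter like this one can miss late-arriving rows, so a real model may need a lookback window.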

Write SQLX assertions (uniqueness, non-null, and row-condition checks) to catch data quality issues before they reach dashboards
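One way this can look is built-in assertions declared in the config block of the model they protect. The model and column names here are hypothetical.

```sql
config {
  type: "table",
  assertions: {
    uniqueKey: ["order_id"],
    nonNull: ["order_id", "customer_id"],
    rowConditions: ["order_total >= 0"]
  }
}

-- Each assertion compiles to a query that must return zero rows to pass.
SELECT
  order_id,
  customer_id,
  order_total
FROM ${ref("stg_orders")}
```

For checks that don't fit these three forms, a standalone SQLX file with type "assertion" can hold any query that selects the offending rows; it passes when the query returns nothing.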

Set up release configurations pinned to Git tags, with scheduled workflow runs on top, for controlled production deployments

Configure Dataform compilation variables for environment-specific logic (dev/staging/prod)
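A sketch of environment-specific logic, assuming a hypothetical compilation variable named env that is defined under vars in the project settings file (workflow_settings.yaml, or dataform.json in older projects) and overridden per release configuration. Table and column names are also illustrative.

```sql
config { type: "table" }

SELECT
  order_id,
  customer_id,
  order_total
FROM ${ref("stg_orders")}
-- In dev, keep roughly 1% of rows to make iteration cheap; other environments get all rows.
${when(
  dataform.projectConfig.vars.env === "dev",
  "WHERE MOD(ABS(FARM_FINGERPRINT(CAST(order_id AS STRING))), 100) = 0"
)}
```

Because the condition is evaluated at compile time, the production release compiles to plain SQL with no filter at all.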

Dataform or dbt?

Choose Dataform when:

  • You're on GCP and value native BigQuery integration over cross-cloud portability
  • Your team doesn't already have dbt investment
  • You want the simplest path from raw data to reporting layer

Choose dbt Cloud when:

  • You want a polished hosted developer experience (browser IDE, job scheduling, CI) out of the box
  • Your team already works in dbt every day
  • You need portability to Snowflake or other warehouses down the line

Choose dbt Core when:

  • You want open-source dbt without dbt Cloud pricing
  • You're comfortable wiring up your own orchestration (usually Dagster or GitHub Actions)