Dagster
Orchestration for when pipelines get complex enough that “scheduled to run every morning” no longer cuts it.
What It Is
Dagster is a modern data orchestrator — the tool that sits on top of your pipelines and transformations to schedule them, handle dependencies, retry failures, and give you visibility into what ran, when, and why something broke. It competes with Airflow and Prefect, with a stronger focus on data quality and developer experience. Recent releases add Dagster Components (YAML-first pipeline authoring, friendly to AI coding agents) and an MCP server for direct integration with tools like Claude Code and Cursor.
Why We Chose It
Most of our projects don't need orchestration — Airbyte and Dataform both schedule themselves, and for simple stacks adding an orchestrator is over-engineering. When orchestration is genuinely needed (custom scripts with dependencies, complex data quality checks, cross-pipeline coordination), Dagster is the clearest improvement over Airflow. The software engineering quality is noticeably higher, the UI is actually usable, and the asset-based model matches how data teams think about pipelines.
How We Use It
- Orchestrate multi-step custom Python scripts with explicit dependencies and data quality checks between steps
- Define and monitor "assets" — the actual tables and files your pipeline produces, not just the jobs that build them
- Coordinate dbt Core runs alongside custom pipelines when dbt Cloud's scheduler isn't enough
- Use Dagster Components and the dg CLI for YAML-driven pipeline definitions — readable for humans and friendly to AI coding agents
- Set up Dagster Cloud for managed hosting when clients don't want to maintain GKE infrastructure
- Integrate Dagster alerts with Slack, PagerDuty, or email so failures surface in the tools your team already watches
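For the Components workflow, a `defs.yaml` might look roughly like the sketch below. The component type, attribute names, and project path are assumptions about the dg-era Components format and may not match your Dagster version; check the scaffold `dg` generates for you:

```yaml
# defs.yaml — hypothetical Components definition scaffolded with `dg`.
type: dagster_dbt.DbtProjectComponent
attributes:
  project: "{{ project_root }}/transform/dbt"
```

The point of the format is that a human can review this in a pull request at a glance, and an AI coding agent can generate or edit it without touching Python.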
When Dagster is the right orchestrator — and when it isn't
Choose Dagster when:
- You have 4+ custom pipelines with cross-pipeline dependencies
- You need data quality checks as first-class pipeline elements
- You want the best developer experience of any major orchestrator
Choose no orchestrator when:
- Your stack is Airbyte + Dataform + a BI tool — each tool self-schedules
- You don't have custom pipelines that need cross-tool coordination
Choose Cloud Functions + Cloud Scheduler when:
- You have 1–3 simple scheduled scripts, each independent
- You don't need dependency management between them
Choose Airflow when:
- You're already in an Airflow shop with operational investment
- The cost of switching outweighs Dagster's advantages