6 Comments
Dennys:

What if you have more than 400 dbt models? Is DbtDag still the best option?

Is it better to have 400 tasks instead of just three?
Oleg Agapov:

You could try DbtTaskGroup to group them together. Plus, you can split the dbt project into several DAGs (e.g. by domain).

But if you feel more comfortable with three BashOperator tasks, that's totally fine. I had this setup and it worked well.
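To illustrate the grouping idea, here is a minimal sketch using Cosmos' DbtTaskGroup with a per-domain selector. The project path, profile name, and selector are assumptions, not values from the article:

```python
# Sketch: one DbtTaskGroup per domain instead of one giant DbtDag.
# Paths, profile names, and the selector below are assumptions.
from datetime import datetime

from airflow import DAG
from cosmos import DbtTaskGroup, ProjectConfig, ProfileConfig, RenderConfig

profile_config = ProfileConfig(
    profile_name="my_profile",  # assumed profile name
    target_name="prod",
    profiles_yml_filepath="/usr/local/airflow/dbt/profiles.yml",
)

with DAG(dag_id="dbt_marts", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    # Render only the marts domain into this DAG's task group
    marts = DbtTaskGroup(
        group_id="marts",
        project_config=ProjectConfig("/usr/local/airflow/dbt/my_project"),
        profile_config=profile_config,
        render_config=RenderConfig(select=["path:models/marts"]),
    )
```

Repeating this with a different `select` per domain gives several smaller DAGs rather than one 400-task graph.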
Dennys:

Do you know if the DbtDag implementation is equivalent to running `dbt run --models <model>` for each model while traversing the graph, or is there some parallelization happening behind the scenes?

If so, each task will occupy one slot in Airflow, so if you have multiple pipelines running at the same time sharing the same pool, they will start to slow down while waiting for a free slot, won't they?
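One way to contain the slot-contention problem raised here is to route the dbt tasks into a dedicated Airflow pool. A minimal sketch, assuming the pool `dbt_pool` has been created beforehand (e.g. `airflow pools set dbt_pool 16 "dbt slots"`) and assuming hypothetical paths and profile names:

```python
# Sketch: give Cosmos-rendered dbt tasks their own pool so they don't
# starve other pipelines sharing the default pool. Pool name, paths,
# and profile values are assumptions.
from datetime import datetime

from cosmos import DbtDag, ProjectConfig, ProfileConfig

dag = DbtDag(
    dag_id="dbt_project",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    project_config=ProjectConfig("/usr/local/airflow/dbt/my_project"),
    profile_config=ProfileConfig(
        profile_name="my_profile",
        target_name="prod",
        profiles_yml_filepath="/usr/local/airflow/dbt/profiles.yml",
    ),
    # Forwarded to every rendered task, capping dbt's total slot usage
    operator_args={"pool": "dbt_pool"},
)
```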
Алексей Смирнов:

Nice article. What about running in k8s using Docker and the KubernetesPodOperator?
Oleg Agapov:

Sure, it's totally possible if you are using Kubernetes! I suppose it's going to be similar to the BashOperator, since you are going to run dbt as a CLI command.
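The CLI-in-a-pod approach mentioned above might look like the following sketch. The image name, namespace, and profiles directory are assumptions; the image is presumed to have the dbt project and profiles baked in:

```python
# Sketch: run `dbt run` inside a Kubernetes pod via KubernetesPodOperator.
# Image, namespace, and paths below are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(dag_id="dbt_k8s", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    dbt_run = KubernetesPodOperator(
        task_id="dbt_run",
        name="dbt-run",
        namespace="data",  # assumed namespace
        image="my-registry/dbt-project:latest",  # assumed image with dbt + project
        cmds=["dbt"],
        arguments=["run", "--profiles-dir", "/dbt"],
        get_logs=True,  # stream dbt output into the Airflow task log
    )
```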
Алексей Смирнов:

I was just thinking that it's missing from the article. And this is the way we use it. It's probably the most stable option for bigger dbt projects with Airflow if you're not using a managed cloud service.