Create dbt Documentation Everyone Loves
Just a few tricks that will make your documentation look stunning
One of the things I appreciate about dbt is the ability to generate documentation for my models. I find it helpful that the code and corresponding documentation can exist together in the same codebase. This makes it simpler to create and maintain.
The default documentation generated by dbt is a good starting point. It provides a foundation, including a list of models, relationships between models, and a visual DAG with dependencies. However, we can enhance it further with a few simple tweaks.
Let me share my favorite recipes for creating dbt documentation that everyone will love to use.
Upgrade your starting page
The default starting page of dbt docs is a generic placeholder: it contains no information about your project, so it is worth replacing with something more useful.
Create a file called homepage.md inside the models/ folder with the following content:
{% docs __overview__ %}
# My dbt project
This is a home page for my dbt project.
Link to [GitHub](<https://github.com>)
Link to [Jira](<https://jira.com>)
{% enddocs %}
Inside the “__overview__” docs block you can put any Markdown text that describes your project. This can include a general description of your project, useful links to other services, and any other onboarding materials.
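If the homepage should also show images such as a logo, one possible setup is sketched below. It uses the docs-paths and asset-paths settings in dbt_project.yml; the folder names "docs" and "assets" and the file name logo.png are assumptions made for illustration, not something the project above requires.

```yaml
# dbt_project.yml (sketch; folder and file names are hypothetical)

# Where dbt looks for {% docs %} blocks. Once this is set, dbt only
# searches the listed folders, so "models" stays in the list to keep
# picking up models/homepage.md.
docs-paths: ["models", "docs"]

# Folders copied into the target directory by `dbt docs generate`,
# so that images can be referenced from doc blocks, for example:
#   ![Project logo](assets/logo.png)
asset-paths: ["assets"]
```

Both settings are optional. Without docs-paths, dbt searches all resource paths for doc blocks, which is why a Markdown file placed under models/ works out of the box.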
Hide non-useful content
If you have a fairly large dbt project, you may find that the data sources and projects in the sidebar look messy and are hard to navigate. It can be especially difficult for people who do not frequently view your documentation, such as managers.
To clear the clutter in the elements tree, you can hide unnecessary items. This includes all types of dbt resources (models, seeds, sources, etc.) and external packages. For example, to hide a single model, set docs: show: false in its YAML properties file:
models:
  - name: stg_mysql__users
    docs:
      show: false
Similarly, you can hide folders and packages in dbt_project.yml:
models:
  my_dbt_project:
    staging: # Hide "staging" folder
      +docs:
        show: false
  dbt_utils: # Hide "dbt_utils" package
    +docs:
      show: false
Hidden elements will not be visible in the rendered documentation, but will still be visible in the visual DAG (they will be marked as hidden).
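The same setting can be sketched for a resource type other than models. The snippet below assumes a seed called country_codes, a hypothetical name used only for illustration, and hides it through its properties file in the same way as the model above.

```yaml
# seeds/properties.yml (sketch; the file name and the seed name are hypothetical)
seeds:
  - name: country_codes
    docs:
      show: false   # hide this seed from the documentation sidebar
```

As with models, this only affects what appears in the docs site; the seed itself is still built and can still be referenced.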
Use doc blocks to keep it DRY (Don't Repeat Yourself)
One common problem with documentation is that it can become repetitive very quickly. For instance, the same column may be used in many downstream models, and copying and pasting its description can become tedious.
One solution is to use the dbt_codegen package. With the generate_model_yaml macro, you can generate a YAML schema for your models and pre-populate it with descriptions from upstream models, provided that identical columns exist and have a description:
dbt run-operation generate_model_yaml --args '{"model_names": ["dim_customers"], "upstream_descriptions": True}'
While the described solution is functional, it still has its flaws. If, for instance, you need to modify the definition of a column, you must search for all instances where the column is referenced. This violates the DRY (Don't Repeat Yourself) principle.
A better solution is to use doc blocks. They let you write documentation in Markdown files and then reference it in YAML schemas. Create a Markdown file models/docs.md (you can choose any name) and define a doc block using Jinja like this:
{% docs user_id %}
Unique identifier of a user in our system.
{% enddocs %}
Then simply use this block in a YAML schema:
models:
  - name: dim_users
    columns:
      - name: user_id
        description: "{{ doc('user_id') }}"
You can see here that the column description is not a hardcoded string, but rather a reference to a doc block called “user_id” using a built-in macro doc(). This approach enables you to define your documentation once, in one place, and then reuse it wherever needed.
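To make the reuse concrete, here is a minimal sketch in which two models point at the same “user_id” doc block. The second model, fct_orders, is a hypothetical name added only for illustration.

```yaml
# models/schema.yml (sketch; fct_orders is a hypothetical model)
models:
  - name: dim_users
    columns:
      - name: user_id
        description: "{{ doc('user_id') }}"

  - name: fct_orders
    columns:
      - name: user_id
        # Same doc block, so both columns render the same description
        description: "{{ doc('user_id') }}"
```

If the wording of the definition changes, only the doc block in the Markdown file needs to be edited, and both descriptions pick up the change the next time the docs are generated.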
Color code nodes
Did you know that you can change the color of nodes in the visualization graph?
Now you do! 🙂
You can change the color for a whole folder of models by defining it in dbt_project.yml:
models:
  my_dbt_project:
    staging:
      +docs:
        node_color: "#e6b530"
Or even set it for each particular model, like so:
{{
    config(
        docs={'node_color': '#682adb'}
    )
}}
This will help you set your own color-coding for the DAG, making your documentation fancier, more informative, and pleasant to look at!
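One way to use this is to give every layer of the project its own color. The sketch below assumes the project has a “marts” folder next to “staging”; that folder name and the second hex value are illustrative choices, not part of the setup above.

```yaml
# dbt_project.yml (sketch; the "marts" folder and its color are assumptions)
models:
  my_dbt_project:
    staging:
      +docs:
        node_color: "#e6b530"   # color from the example above
    marts:
      +docs:
        node_color: "#2a7bdb"   # a different color for a different layer
```

A color set with config() inside an individual model is the more specific setting, so it should take precedence over the folder-level value for that model.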
That’s a wrap! I hope you find these tips useful and apply them to make your documentation even better!
If you liked this issue, please subscribe and share it with your colleagues. It really helps me develop this newsletter!
See you next time 👋
Thanks for this. Nice to clean up the home page, hadn't realized I could. I'm curious how you got a specific logo to appear on the documentation home page.
This may be related to Artur's question - but I've seen multiple references in the docs + other articles about using the +docs config to hide packages from the project's documentation. I've tried the format you suggested along with the docs and have not been able to get it to work.
By default, my dbt_packages directory is in the outer most directory of my project, so I'm not sure if I need to move that to be under my "models" directory to get the above config to work? Any suggestions?