Why Column-Level Lineage Matters for dbt: Comparing the Options
Column-level lineage was a hot topic during 2024, with many data platforms and tools adding this feature to their software and boasting the benefits of column-level lineage for data workflows.
In this article, I’ll take a look at how column-level lineage works in dbt, in both open-source and dbt Cloud, and through SQLMesh; and see how the usage and workflow differs between these platforms.

Seeing lineage in your transformation platform, especially during data modeling updates, is particularly important because it’s at this stage that you most need to understand the impact of your actions.
Why model and column-level lineage matter
Model-level lineage is already extremely valuable. It’s the main interface for performing data validation in Recce (through the unique lineage diff view), and provides a holistic overview of your data project which supports the analysis of data flow for root cause and data impact assessment work.
As data projects become more complex, the use of lineage DAGs for visualization becomes more and more important. If you’re making a change to a critical model that is used in many downstream metrics, or a bottleneck model with 100s of downstream, being able to actually see those models is invaluable.

While model-level lineage gives you a high-level view of data flow, it doesn’t tell you how specific columns are evolving through the data project in transformations — that’s where column-level lineages comes in.
Take it to the next (column) level
Column-level lineage takes it to the next level, literally, by providing a more in-depth look at how data moves through your data project. Tracking the evolution of a column, and the data it contains, helps you to analyze exactly why the numbers are the way they are, and where they’re going next.
Common use-cases for column-level lineage are:
- Source Exploration: During development, column-level lineage helps you understand how a column is derived.
- Impact Analysis: When modifying the logic of a column, column-level lineage enables you to assess the potential impact across the entire DAG.
- Root Cause Analysis: Column-level lineage helps identify the possible source of errors by tracing data lineage at the column level.
Rather than use pros and cons for each platform, I’ve highlighted the key points for each platform. This is because your specific use-case will determine whether these points are pros or cons.
Let’s dive in.
dbt column-level lineage
There are two main ways to access column-level lineage for dbt:
- The open-source Power User for dbt (VSCode plugin)
- dbt Cloud as part of dbt Explorer (for dbt Cloud Enterprise customers)
Let’s take a look at how column-level lineage is used via each of these methods:
Power User for dbt (VSCode dbt power user)
Power User for dbt is an open-source VSCode extension and, as the name suggests, brings advanced dbt-related features right into your code editor. Both model and column-level lineage are available and can be viewed in a VSCode panel.

The model-level lineage shows the node type, and also has a high degree of control over the number of parent and child nodes you want to show.

How to enable column-level lineage in dbt Power user
To use column-level lineage you currently need to enable the beta version by clicking the Show New UX (Beta)
button. After that, you can view column-level lineage by:
- Clicking a node in the lineage
- Clicking a column in the details panel

The lineage also shows if the column was transformed, and you can view the code transformations that took place.
If you use VSCode, having model and column-level lineage right is a huge convenience, especially as the column-level lineage is shown as part of the model lineage!
Highlights of column-level lineage in dbt Power User
- Open-source and free to use (An API key is required from Altimate)
- Runs in VSCode so helps you stay on-task
- Part of a full-featured lineage feature
dbt Explorer column-level lineage in dbt Cloud
Column-level lineage in dbt Cloud is available in dbt Explorer, which is part of dbt’s Enterprise offering. dbt positions the column-level lineage feature as a way to understand the lay of the land before making any changes to your project.
A lens on lineage
From the model-level lineage, which contains lineage ‘lens’ to toggle various views, you can select a model and view the column evolution. As column-level lineage is viewed separately from the model lineage, it reduces the speed at which you can inspect multiple column lineages because you have to go in and out of the model details pages.
Columns become nodes
Also of note is that rather than group columns by model, each column is shown as a distinct node on the lineage. For instance, in the screenshot below, there are two nodes used to represent the columns EXTENDED_PRICE
and QUANTITY
that are both from the one model, stg_tphc_line_items
. As mentioned above, your personal preference will determine the usability of this implementation.

Highlights of column-level lineage in dbt Explorer
- Only available to dbt Enterprise customers
- Column-level lineage is shown separately from the model-level lineage
- Columns are shown as distinct nodes in the lineage
Recce: column-level lineage for data model validation
Open-source data-model validation tool, Recce, also recently added column lineage as part of its impact assessment workflow.

Column-lineage in Recce is shown directly on the lineage DAG (in Recce it’s actually a lineage DAG diff), so you can easily click about and inspect the lineage of various columns, while staying on the task of checking which models are impacted by your data modeling changes. Each column also displays the type of transformation that has taken place, such as derived (transformed), renamed, or simply passed-through.
As mentioned, Recce positions the column-lineage as a method to help with impact assessment, which makes sense when you have a bunch of models to check after updating an upstream model — knowing which models use that column can be very useful.
Highlights of column-level lineage in Recce
- Open-source version is free to use
- Column information is displayed on the main lineage DAG
- Part of a larger data-validation workflow
SQLMesh column-level lineage
SQLMesh, the data transformation platform that quickly emerged as the primary competitor to dbt, integrates many features into it’s open-source offering.
In addition to data transformation, impact analysis, audits, blue/green deployments, column-level lineage is also an integral part of the platform. Notably, SQLMesh actually enables column-level lineage for dbt projects.

SQLMesh runs in a browser, and accessing the related lineage for each model is as easy as clicking the model in the catalog or file view.
The SQLMesh column-level lineage is unique from the other implementations, and more opinionated, for a few reasons.
CTEs become nodes
The first reason is that CTEs are displayed as nodes in the column-level view. It’s a interesting choice, but could result in information overload if your DAG is very busy.

Heads-up(stream)
The second is that only the upstream lineage is shown. For instance, in the screenshot below only the upstream column usage of the amount
column is shown, the downstream is not.

This is why I described this as a more opinionated implementation. I suppose the expectation is — you’re looking at a number and you want to know how that number came to be, so naturally you want to look upstream.
Refresh, refresh, refresh
One issue with navigating lineage in SQLMesh is the constant reloading. Each time you click a node the whole lineage is refreshed. This makes it incredibly difficult to assess lineage, while at the same time keeping track of where you are in the bigger picture.

This refresh remains for going from project to column-level lineage. Again, making it possible to lose that thread.
Highlights of Column-level lineage in SQLMesh
- Native feature of SQLMesh
- Open-source
- The lineage reloads when selecting a node which can be distracting
- Only the upstream lineage is shown
- CTEs are shown on the lineage as nodes
Additional considerations
Just tracking how data flows through columns is useful, but there are other considerations to take into account when exploring data impact. If and how a column was transformed can offer greater insight into which nodes you need to focus on for root cause analysis. Also, keep in mind that columns used in filtering or joins may not appear in column-level lineage as being ‘used’ in that node.
Column transformation type
One aspect of column-level lineage that is particularly useful for root cause analysis is knowing if a transformation occurred in the model.
When the column passed through this model, what happened to it?
- Transformed
- Passed-through
- Renamed

If the column was transformed, then having the ability to view those transformations can further enable your work understanding more about a columns evolution.
“Where” is the caveat
If a column is used downstream in a WHERE clause, or join, this does not count as the column being used in that model. Filtering and join conditions establish model relationships, but do not modify or transform the column data. Therefore, they will not be shown on the column-level lineage in most implementations.
Conclusion
Each of the column-level lineage offerings above differs in slight ways that would inform the way you use them. In that regard, which is most useful to you would depend on your particular use-case. Implementation aside, there’s no denying the use of column-level lineage as a tool in impact assessment both before and after modeling changes, and in gaining a deep understanding of data flow through your data project.
Are you using one of the many column-level lineage tools? Which one are you using, is it good? Anything to correct in the summaries above? Please leave a comment!