Lineage
The Lineage page is the main interface to Recce. It lets you quickly determine the zone of impact of any modeling changes.
Lineage Diff
It's from Lineage Diff that you will determine which models to investigate further to validate your changes.
Node Summary
- Models are color coded to indicate added, removed, and modified models.
- The bottom icon indicates whether a row count change or schema change is detected. A row count changed icon is only shown if a row count diff has been executed on this node.
- Click a model to view the Node Detail and perform other checks.
Filter Nodes
In the top control bar, you can change the rules used to filter the nodes:
- View Mode:
  - Changed Models: Modified nodes, their downstream nodes, and their first-degree parents.
  - All: Show all nodes.
- Package: Filter by dbt package names.
- Select: Select nodes by node selection.
- Exclude: Exclude nodes by node selection.
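To make the Changed Models rule concrete, here is a minimal sketch of how such a view could be computed on a DAG. It is not Recce's implementation: the function name `changed_models_view`, the `parents` mapping, and the model names are hypothetical, and it assumes "first-degree parents" means the direct parents of the modified nodes.

```python
from collections import deque


def changed_models_view(parents, modified):
    """Return the set of nodes a 'Changed Models'-style view would show.

    parents: dict mapping each node to a list of its direct parents.
    modified: iterable of modified node names.
    """
    modified = list(modified)

    # Build the reverse mapping (node -> direct children).
    children = {}
    for node, node_parents in parents.items():
        for parent in node_parents:
            children.setdefault(parent, []).append(node)

    # Breadth-first walk to collect every downstream node.
    visible = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in visible:
                visible.add(child)
                queue.append(child)

    # Add only the direct parents of the modified nodes (assumption).
    for node in modified:
        visible.update(parents.get(node, []))
    return visible


# Hypothetical project: only stg_orders is modified.
parents = {
    "raw_orders": [],
    "raw_customers": [],
    "stg_orders": ["raw_orders"],
    "stg_customers": ["raw_customers"],
    "orders": ["stg_orders"],
    "customers": ["stg_customers", "orders"],
}
view = changed_models_view(parents, ["stg_orders"])
```

In this example the view contains `stg_orders`, its downstream models `orders` and `customers`, and its direct parent `raw_orders`. The unrelated `stg_customers` and `raw_customers` are left out.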
Select Nodes
By clicking the Select nodes button at the top-right corner, you can select multiple nodes for further operations. For details, see the Multi Nodes Selection section.
Row Count Diff
You can run the row count diff on the selected nodes (selected by select and exclude).
To run the row count diff by selector
- Click the ... button at the top-right corner.
- Click on Row Count Diff by Selector.
Node Detail
Schema Diff
Note
Schema Diff requires catalog.json in both environments.
Schema Diff shows added, removed, and renamed columns. Click a model in the Lineage DAG Diff to view the Schema Diff.
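As a rough illustration of the comparison, the sketch below diffs two ordered lists of column names. It is an assumption-laden example, not Recce's logic: the function name `schema_diff` and the column names are hypothetical, and it only reports added and removed columns. Rename detection is left out because this page does not say how renames are identified.

```python
def schema_diff(base_columns, current_columns):
    """Compare two ordered lists of column names.

    Returns the columns that exist only in the current environment
    (added) and only in the base environment (removed).
    """
    base_set = set(base_columns)
    current_set = set(current_columns)
    return {
        # Preserve the column order of the environment each name comes from.
        "added": [c for c in current_columns if c not in base_set],
        "removed": [c for c in base_columns if c not in current_set],
    }


# Hypothetical column lists for one model in each environment.
base = ["customer_id", "first_name", "last_name", "lifetime_value"]
current = ["customer_id", "first_name", "last_name", "net_lifetime_value", "order_count"]
diff = schema_diff(base, current)
```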
Row Count Diff
Row Count Diff shows the difference in row count between the base and current environments.
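The comparison itself is simple arithmetic. The following sketch assumes the row counts have already been collected into two dictionaries; the function name `row_count_diff` and the model names are hypothetical and are not part of Recce.

```python
def row_count_diff(base_counts, current_counts):
    """Compare per-model row counts between two environments.

    A count of None means the model does not exist in that environment,
    in which case no delta is computed.
    """
    result = {}
    for model in sorted(set(base_counts) | set(current_counts)):
        base = base_counts.get(model)
        current = current_counts.get(model)
        if base is None or current is None:
            delta = None
        else:
            delta = current - base
        result[model] = {"base": base, "current": current, "delta": delta}
    return result


# Hypothetical counts: 'customers' was removed and 'new_model' was added.
base_counts = {"orders": 100, "customers": 50}
current_counts = {"orders": 120, "new_model": 5}
counts = row_count_diff(base_counts, current_counts)
```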
Code Diff
- Select the model from the Lineage DAG.
- Click the Diff button in the upper-right corner.
Value Diff
Note
Value Diff uses compare_column_values from audit-helper. To use Value Diff, ensure that audit-helper is installed in your project.
Value Diff shows the matched count and percentage for each column in the table. It uses the primary key(s) to uniquely identify records between the model in both environments.
The primary key is automatically inferred from the first column with the unique test. If no primary key is detected, at least one column must be specified as the primary key.
- Added: Newly added PKs.
- Removed: Removed PKs.
- Matched: For a column, the count of matched values among common PKs.
- Matched %: For a column, the ratio of matched values over common PKs.
You can query all the diff records from the value diff result.
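The four metrics above can be illustrated with a small in-memory sketch. This is not how Recce or audit-helper computes them (that happens in SQL); the function name `value_diff`, the single-column primary key, and the sample rows are assumptions made for the example.

```python
def value_diff(base_rows, current_rows, primary_key):
    """Compute Added, Removed, Matched, and Matched % for two row lists.

    base_rows / current_rows: lists of dicts, one dict per record.
    primary_key: name of a single column that uniquely identifies a record.
    """
    base = {row[primary_key]: row for row in base_rows}
    current = {row[primary_key]: row for row in current_rows}

    added = current.keys() - base.keys()      # PKs only in current
    removed = base.keys() - current.keys()    # PKs only in base
    common = base.keys() & current.keys()     # PKs in both

    columns = sorted(
        {col for row in base_rows + current_rows for col in row} - {primary_key}
    )
    per_column = {}
    for col in columns:
        matched = sum(
            1 for pk in common if base[pk].get(col) == current[pk].get(col)
        )
        # Matched % is only defined when there is at least one common PK.
        matched_pct = matched / len(common) if common else None
        per_column[col] = {"matched": matched, "matched_pct": matched_pct}

    return {"added": len(added), "removed": len(removed), "columns": per_column}


# Hypothetical records keyed by 'id'.
base_rows = [
    {"id": 1, "status": "paid", "amount": 10},
    {"id": 2, "status": "open", "amount": 20},
    {"id": 3, "status": "open", "amount": 30},
]
current_rows = [
    {"id": 2, "status": "open", "amount": 25},
    {"id": 3, "status": "paid", "amount": 30},
    {"id": 4, "status": "open", "amount": 40},
]
result = value_diff(base_rows, current_rows, "id")
```

Here one PK is added (4), one is removed (1), and two are common (2 and 3). Each non-key column matches on exactly one of the two common PKs, so both have a matched ratio of 0.5.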
Profile Diff
Note
Profile Diff uses get_profile from dbt-profiler. To use Profile Diff, ensure that dbt-profiler is installed in your project.
Profile Diff compares basic statistics (e.g. count, distinct count, min, max, average) for each column between the two environments.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button
Please refer to dbt-profiler for the definitions of the profiling stats.
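For intuition, the sketch below profiles one column in each environment and pairs the results. It is only an illustration: the function names `profile` and `profile_diff` are hypothetical, the statistics are computed in plain Python rather than by dbt-profiler, and the exact definitions (for example, whether nulls are counted) are assumptions that may differ from dbt-profiler's.

```python
from statistics import mean


def profile(values):
    """Compute basic statistics for one column, ignoring None values."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(non_null),
        "distinct_count": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "avg": mean(non_null) if non_null else None,
    }


def profile_diff(base_values, current_values):
    """Pair each statistic as (base, current) for side-by-side review."""
    base = profile(base_values)
    current = profile(current_values)
    return {stat: (base[stat], current[stat]) for stat in base}


# Hypothetical values of one numeric column in each environment.
stats = profile_diff([1, 2, 2, None], [1, 2, 3, 4])
```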
Histogram Diff
Histogram Diff compares the distribution of a numeric column in an overlay histogram chart.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button and select Histogram Diff.
- Select the column to diff and click Execute.
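An overlay histogram is only meaningful when both environments are binned on the same edges. The sketch below shows one way to do that; the function name `shared_histograms`, the bin count, and the sample values are hypothetical, and it assumes both inputs are non-empty. It is not Recce's binning logic.

```python
def shared_histograms(base, current, bins=10):
    """Bin two numeric samples on a common set of equal-width bin edges.

    Returns (edges, base_counts, current_counts). Both inputs must be
    non-empty sequences of numbers.
    """
    lo = min(min(base), min(current))
    hi = max(max(base), max(current))
    # Fall back to a width of 1 when every value is identical.
    width = (hi - lo) / bins or 1

    def count(values):
        counts = [0] * bins
        for v in values:
            # The maximum value falls on the last edge; keep it in the last bin.
            index = min(int((v - lo) / width), bins - 1)
            counts[index] += 1
        return counts

    edges = [lo + i * width for i in range(bins + 1)]
    return edges, count(base), count(current)


# Hypothetical column values in each environment.
edges, base_counts, current_counts = shared_histograms(
    [0, 1, 2, 3, 4], [2, 3, 4, 5, 10], bins=5
)
```

Because the edges span the combined range of both samples, the value 10 that appears only in the current environment widens the bins for the base counts as well, which keeps the two series comparable bar by bar.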
Top-K Diff
Top-K Diff compares the distribution of a categorical column. The top 10 elements are shown by default. This can be expanded to the top 50 elements.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button and select Top-K Diff.
- Select the column to diff and click Execute.
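The idea behind a top-K comparison can be sketched with `collections.Counter`. The function name `top_k_diff` and the sample values are hypothetical, and ranking categories by their combined count across both environments is an assumption, not a description of how Recce orders them.

```python
from collections import Counter


def top_k_diff(base_values, current_values, k=10):
    """Return (value, base_count, current_count) for the k most common values.

    Values are ranked by their combined count across both environments
    (an assumption made for this sketch).
    """
    base = Counter(base_values)
    current = Counter(current_values)
    combined = base + current
    top_values = [value for value, _ in combined.most_common(k)]
    # Counter returns 0 for values missing from one environment.
    return [(value, base[value], current[value]) for value in top_values]


# Hypothetical categorical column in each environment.
base_values = ["a", "a", "a", "b", "c"]
current_values = ["a", "b", "b", "d"]
top = top_k_diff(base_values, current_values, k=2)
```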
Multi Nodes Selection
Select Nodes
- Click the Select nodes button
- Select one or more nodes
- Or right-click a node and choose Select parent nodes or Select child nodes.
- Click the action in the multi select control bar.
Row Count Diff
Row Count Diff shows the difference in row count between the base and current environments.
Value Diff
Screenshot
The diff result includes a Copy to Clipboard button. It is a handy way to copy the result image to the clipboard and paste it into your PR comment.
Note
Firefox does not support copying images to the clipboard, so Recce shows a modal instead. You can download the image or right-click it to copy it.
Add to Checklist
On the lineage page, you can run different types of checks. You may want to add a check to the checklist for these reasons:
- Keep the check so you can rerun it after a code change.
- Record your result and interpretation for review purposes.
Lineage Diff
Lineage diff by selector
- Select nodes by Select and Exclude on the top control.
- Click ... at the top-right corner.
- Click Lineage diff.
Lineage diff by multi nodes selection
- Click the Select nodes button at the top-right corner.
- Select nodes
- Click the Add lineage diff check button
Schema Diff
Schema diff by node selector
- Select nodes by Select and Exclude on the top control.
- Click ... at the top-right corner.
- Click the Schema diff button
Schema diff by multi nodes selection
- Click the Select nodes button at the top-right corner.
- Select nodes
- Click the Add schema check button
Schema diff for single node
- Select a node to show the node detail pane.
- Click the Add check button on the node detail pane.
- Click Schema check
Row Count Diff
Row count diff by node selector
- Select nodes by Select and Exclude on the top control.
- Click ... at the top-right corner.
- Click Row Count Diff by Selector to run the row count diff.
- Click Add to checklist on the result page.
Row count diff by multi nodes selection
- Click Select nodes button
- Select nodes
- Click Row count diff to run the row count diff.
- Select a node to show the run result.
- Click Add to checklist
Other Diffs
- Execute the diff
- Click Add to checklist