Recce State File
Introduction
The state file represents the serialized state of a Recce instance. It is a JSON-formatted file containing the following information:
- Checks: Data from the checks added to the checklist on the Checklist page.
- Runs: Each diff execution in Recce corresponds to a run, similar to a query in a data warehouse. Typically, a single run submits a series of queries to the warehouse and retrieves the final results.
- Environment Artifacts: Includes the manifest.json and catalog.json files for both the base and current environments.
- Runtime Information: Metadata such as Git branch details and pull request (PR) information from the CI runner.
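Because the state file is plain JSON, it can be inspected with any JSON tooling. The following is a minimal sketch, not part of Recce itself: the helper name summarize_state is hypothetical, and it assumes only that the top level of the file is a JSON object. It makes no assumption about the exact key names Recce uses.

```python
import json
from pathlib import Path


def summarize_state(path):
    """Return {top-level key: short description} for a JSON state file.

    Assumption: the top level of the file is a JSON object. The key names
    are read from the file itself, not hard-coded.
    """
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(state, dict):
        raise ValueError("expected a JSON object at the top level")

    summary = {}
    for key, value in state.items():
        if isinstance(value, (list, dict)):
            # Containers: report their type and how many entries they hold.
            summary[key] = f"{type(value).__name__} with {len(value)} item(s)"
        else:
            # Scalars: report only the type name.
            summary[key] = type(value).__name__
    return summary
```

Calling summarize_state on a saved state file gives a quick overview of what the file contains before you share it or load it in review mode.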
How to Save the State File
There are multiple ways to save the state file.
- Save from the Web UI: Click the Save button at the top of the app. Recce will continuously write updates to the state file, effectively working like an auto-save feature, and persist the state until the Recce instance is closed. The file is saved with the specified filename in the directory where the recce server command is run.
- Export from the Web UI: Click the Export button located in the top-right corner to download the current Recce state to any location on your machine.
- Start Recce from a State File: You can provide a state file as an argument when launching Recce. If the file does not exist, Recce will create a state file and start with an empty state. If the file exists, Recce will load the state and continue working from it.
- Use the run Command: For more complex dbt projects with a CI/CD process, where dbt transformations are executed and results are placed in a PR-specific environment, you can integrate the recce run command into your workflow. This allows reviewers to easily audit results and decide whether a merge can proceed.
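The "load if present, otherwise start empty" behavior described for launching Recce with a state file can be illustrated with a small sketch. This is not Recce's implementation: the function name load_or_create_state is hypothetical, the empty dictionary is only a placeholder for whatever Recce treats as an empty state, and writing the file immediately is an assumption.

```python
import json
from pathlib import Path


def load_or_create_state(path):
    """Load the state if the file exists; otherwise create an empty one."""
    state_path = Path(path)
    if state_path.exists():
        # Existing file: continue working from the saved state.
        return json.loads(state_path.read_text(encoding="utf-8"))

    # Missing file: start with an empty state and create the file.
    # Placeholder only; the real empty-state layout is not specified here.
    empty_state = {}
    state_path.write_text(json.dumps(empty_state), encoding="utf-8")
    return empty_state
```

The practical consequence is that the same launch command works for both a first session and a resumed one.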
How to Use the State File
The state file can be used in several ways:
- Continue the state: Launch Recce with the specified state file.
- Review the state: Running Recce with the --review option enables review mode. In this mode, Recce uses the dbt artifacts in the state file instead of those in the target/ and target-base/ directories. This option is useful for distinguishing between development and review purposes.
- Import checklist from file: To preserve favorite checks across different branches, you can import a checklist by clicking the Import button at the top of the checklist.
- Continue the state from recce run: This will execute the checks in the specified state file.
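The difference between continuing a state and reviewing it comes down to how the server is launched. The sketch below builds the command line for both cases. The helper build_server_command and the file name recce_state.json are hypothetical, placing --review before the state file is an assumption, and the check that review mode requires a local state file reflects only the description above, where review mode reads dbt artifacts from the state file.

```python
def build_server_command(state_file=None, review=False):
    """Build the argument list for launching the Recce server.

    state_file: optional path to a state file to continue from.
    review: if True, add --review so artifacts come from the state file.
    """
    if review and state_file is None:
        # Assumption: review mode reads dbt artifacts from a local state file.
        raise ValueError("review mode needs a state file")

    command = ["recce", "server"]
    if review:
        command.append("--review")
    if state_file is not None:
        command.append(str(state_file))
    return command
```

For example, build_server_command("recce_state.json", review=True) returns a list that can be passed to subprocess.run to start a review session, while build_server_command("recce_state.json") resumes a development session.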
Scenario: Development
In the development workflow, the state file acts as a session for developing a feature. It allows you to store checks to verify the diff results against the base environment.
Common development workflow:
- Run the recce server without a state file.
- Add checks to the checklist.
- Save the state by clicking the Save or Export button.
- Resume your session by launching Recce with the saved state file.
Scenario: PR Review
During the PR review process, the state file serves as a communication medium between the submitter and the reviewer.
- Start the Recce server without a state file.
- Add checks to the checklist.
- Save the state by clicking the Save or Export button.
- Share the state file with the reviewer or attach it as a comment in the pull request.
- The reviewer reviews the results using the state file.
Recce Cloud
Note
Recce Cloud is currently under development. If you are interested, please sign up for a Recce Cloud invite or contact us in the #tool-recce channel of the dbt Slack.
Although a state file can store the state, it is not well suited to recording the latest review status of a PR, especially since a PR may involve the submitter, the reviewer, and automated processes in the CI workflow. Recce Cloud is intended to solve this problem of managing PR review status.
Prerequisites
Before uploading the state file to Recce Cloud, you need to define a password. The password is used to encrypt the state file before it is uploaded and to decrypt it when you download it from Recce Cloud. The password is not stored in Recce Cloud, so you need to keep it safe.
PR Review Workflow
This is the most common workflow: the submitter pushes commits to the GitHub PR, and the reviewer is responsible for reviewing and auditing the PR. This includes checking code changes, ensuring requirements are met, and assessing whether there is any impact on existing models.
As a submitter
- Push changes to the remote
- Create PR for review
- Run dbt to prepare the PR review environment
- Launch recce server for this PR branch
- Add recce checks for review
- Leave description and screenshots in the PR comments
As a reviewer
- Check out the PR branch
- Launch recce server for this PR branch in review mode
- If all checks are good, mark all checks as Approved. Otherwise, leave a comment in the PR or in the recce check description.
PR Review Workflow with CI
For more mature projects, we introduce CI automation to standardize the process and reduce human-caused variability or errors. In this workflow, CI does two things:
- Execute dbt to create a PR environment.
- Execute recce to update dbt artifacts, rerun the check runs, and update the PR status in Recce Cloud.
As a submitter
- Push changes to the remote
- Create PR for review
In the CI workflow of the PR push event
- The GitHub Actions workflow is triggered by the push event
- Check out the PR branch
- Fetch the dbt artifacts for the base environment
- Run dbt for the PR environment
- Run recce for the current PR and upload the state to Recce Cloud
As a submitter and reviewer, collaborate on the state using the recce server in review mode
- Check out the PR branch
- Launch recce server for this PR branch in review mode