Demo Tutorial
Estimated Time: 20 minutes
Note
Recce Cloud is currently in private alpha and scheduled for general availability soon. Sign up to the Recce newsletter to be notified, or email product@datarecce.io to join our design partnership program for early access.
The following guide uses the official Jaffle Shop DuckDB project from dbt-labs, and provides everything you need to get started with Recce Cloud. By the end of the guide you'll be able to create and sync Recce checks with a GitHub PR via Recce Cloud.
To see what you'll get, check out the first section from the following Loom:
Clone Jaffle Shop to your own private repository
- Create a private repository in your GitHub account.
- Clone the Jaffle Shop DuckDB dbt data project:
- Change the remote URL to the repository you just created:
- Push to your newly created repository:
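The clone, re-point, and push steps above can be sketched as a small shell function. This is a minimal sketch, not the tutorial's official commands: the function name `clone_jaffle_shop` and the `MY_REPO_URL` value are hypothetical placeholders, and the upstream URL assumes the public dbt-labs repository on GitHub.

```shell
# Hypothetical placeholder: replace with the URL of the private repository
# you created in the first step.
MY_REPO_URL="${MY_REPO_URL:-git@github.com:your-account/your-private-repo.git}"

# Assumed upstream: the public Jaffle Shop DuckDB project from dbt-labs.
JAFFLE_SHOP_URL="${JAFFLE_SHOP_URL:-https://github.com/dbt-labs/jaffle_shop_duckdb.git}"

clone_jaffle_shop() {
  # Clone the upstream project into ./jaffle_shop_duckdb and enter it.
  git clone "$JAFFLE_SHOP_URL" jaffle_shop_duckdb &&
  cd jaffle_shop_duckdb &&
  # Point "origin" at your own repository instead of dbt-labs.
  git remote set-url origin "$MY_REPO_URL" &&
  # Push the current branch and set it as the upstream tracking branch.
  git push -u origin HEAD
}
```

After setting `MY_REPO_URL`, run `clone_jaffle_shop` from the directory where you keep your projects. Your shell stays inside the cloned project afterwards.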
Authorize Recce Cloud to access the repository
Recce Cloud needs access to your data project's repository in order to sync the status of your checks to the pull request thread.
- Visit Recce Cloud. If it is your first time logging in, click the Continue with Github button to authorize the Recce Cloud integration to access your GitHub account.
- Click the Install button to install the Recce Cloud GitHub app to your personal or organization account.
- On the app installation page, authorize Recce Cloud to access the repository you created in the previous section.
- Authorized repositories will then be shown in your Recce Cloud account.
Configure the Jaffle Shop DuckDB data project
Set up the Jaffle Shop project and install Recce.
- Create a new Python virtual env:
- Install the requirements and Recce:
- Add a production environment to the project by editing ./profiles.yml and adding the following target:
- Add the following packages required by Recce for some features (highly recommended). Create a ./packages.yml file in the root of your project with the following packages:
- Install the packages:
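The environment and package installation steps above might look like the following. This is a sketch under assumptions: the function names are hypothetical, the virtual environment is assumed to live in `./venv`, and it assumes the project ships a `requirements.txt` and that Recce is installed from PyPI as `recce`. The contents of `profiles.yml` and `packages.yml` are not reproduced here.

```shell
setup_python_env() {
  # Create and activate an isolated virtual environment in ./venv.
  python3 -m venv venv &&
  . venv/bin/activate &&
  # Install the project's own requirements, then Recce itself.
  pip install -r requirements.txt &&
  pip install recce
}

install_dbt_packages() {
  # Run this only after editing profiles.yml and creating packages.yml;
  # dbt deps downloads the packages listed in packages.yml.
  dbt deps
}
```

Run `setup_python_env` first, make the two YAML edits described above, and then run `install_dbt_packages`.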
Prepare the base environment
Recce requires two environments to compare. The base represents your point of reference (the known-good base), and the target represents your PR/development branch.
- Prepare the production (base) environment. (Note the custom --target-path):
- Add the target-base/ folder to the .gitignore file:
- Remove the existing GitHub Actions workflows:
- Push the changes to remote:
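As a rough sketch, the base environment steps above could be scripted as follows. The function names, the commit message, and the list of files staged are illustrative assumptions. The `prod` target is assumed to be the one added to profiles.yml earlier, and the workflows are assumed to live in `.github/workflows`.

```shell
build_base_artifacts() {
  # Build the production data, then write the dbt artifacts to
  # target-base/ (instead of the default target/) so Recce can use
  # them as the base environment.
  dbt seed --target prod &&
  dbt run --target prod &&
  dbt docs generate --target prod --target-path target-base/
}

commit_base_setup() {
  # Keep the generated base artifacts out of version control.
  echo "target-base/" >> .gitignore &&
  # Remove the upstream GitHub Actions workflows (assumed location).
  git rm -r .github/workflows &&
  # Stage only the files edited in this tutorial (an assumption).
  git add .gitignore profiles.yml packages.yml &&
  git commit -m "Prepare base environment for Recce" &&
  git push
}
```

Run `build_base_artifacts` first and `commit_base_setup` second, both from the project root.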
Important
By default, Recce expects the dbt artifacts for the base environment to be located in a folder named target-base.
The base environment preparation is now complete. The data in the prod schema, and artifacts in the target-base folder, represent stable (production) data.
As a PR author, you'll be working on data models, making changes to the project, and validating your work for correctness.
Prepare the review state for the PR
In this section, you'll make a new branch, update a data model, and create a pull request.
- Check out a new branch:
- Edit the staging model located in ./models/staging/stg_payments.sql as follows:
- Run dbt on the development environment (the default target):
- Commit the change:
- Create a pull request for this branch in your GitHub repository.
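The branch, build, and commit steps above can be sketched like this. The branch name, function names, and commit message are hypothetical. Including `dbt docs generate` assumes that Recce reads the development artifacts from the default `target/` folder. The model edit itself is done by hand between the two functions.

```shell
# Hypothetical branch name; choose your own.
BRANCH_NAME="${BRANCH_NAME:-feature/recce-cloud-demo}"

start_pr_branch() {
  # Create the PR branch and switch to it.
  git checkout -b "$BRANCH_NAME"
}

build_dev_and_push() {
  # Rebuild the development environment (the default target) and
  # regenerate its artifacts.
  dbt seed &&
  dbt run &&
  dbt docs generate &&
  # Commit only the edited staging model and publish the branch.
  git add models/staging/stg_payments.sql &&
  git commit -m "Update stg_payments model" &&
  git push -u origin "$BRANCH_NAME"
}
```

Call `start_pr_branch`, edit the model, then call `build_dev_and_push` before opening the pull request on GitHub.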
Important
Don't forget to create a branch for the commit above, before continuing with this tutorial.
Launch a Recce Instance to validate your change
In this section, you will launch a Recce Instance, create validation checks, and sync those checks with Recce Cloud so they can be reviewed by your PR reviewer.
Prepare a GitHub Token and Recce State password
To access the repository, your local Recce Instance will require a GitHub Token (Classic).
- Prepare a GitHub Token (Classic) in your account. Ensure you provide repo permission for the new token.
- Ensure you have configured these environment variables.
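For illustration, the environment variables might be set as shown below. The names `GITHUB_TOKEN` and `RECCE_STATE_PASSWORD` are assumptions based on the token and State password described in this section, so confirm them against the current Recce documentation. The values are placeholders.

```shell
# Assumed variable names; replace the placeholder values with your own.
# The token is the GitHub Token (Classic) with repo permission.
export GITHUB_TOKEN="<your-github-token>"
# The password is used to encrypt and decrypt the Recce State file.
export RECCE_STATE_PASSWORD="<your-recce-state-password>"
```

Set these in the same shell session that you use to launch Recce, and avoid committing them to the repository.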
Run Recce in Cloud Mode
Run the Recce instance in cloud mode:
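A minimal sketch of the launch step, wrapped in a hypothetical helper that first checks the two environment variables (whose names are assumed, as above) before starting the server with the `--cloud` flag:

```shell
launch_recce_cloud() {
  # Refuse to start if either assumed variable is missing or empty.
  if [ -z "${GITHUB_TOKEN:-}" ] || [ -z "${RECCE_STATE_PASSWORD:-}" ]; then
    echo "Set GITHUB_TOKEN and RECCE_STATE_PASSWORD first." >&2
    return 1
  fi
  # Start the local Recce Instance with cloud sync enabled.
  recce server --cloud
}
```

Run `launch_recce_cloud` from the project root on your PR branch. The server keeps running in the foreground until you terminate it.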
Open the link to the Recce Instance in your browser. By default it should be http://0.0.0.0:8000
Create a Recce Check
Switch to the Query tab and paste the following query:
Enter the primary key as order_id and click the Run Diff button.
1. Click the Add to Checklist button to add the query result to your Checklist.
2. On the Checklist page you'll find that there are three checks. The Row count diff and Schema diff are default Preset Checks, and the Query diff is your newly added check. Leave the checks as unapproved.
3. Go back to the command line and terminate the Recce instance. Your Recce State file, containing your checklist and other artifacts, will be encrypted and uploaded to Recce Cloud.
4. Go to the PR page in your GitHub repository and scroll to the bottom. Notice that Recce Cloud shows that the checks are not approved:
Note
Recce checks sync in real time. However, due to the overhead of encrypting, compressing, and transferring the State file, the sync may be slightly delayed. Always terminate your Recce Instance on the CLI and wait for the State to be synced. This ensures your checks are saved to Recce Cloud.
Review the PR
As a PR reviewer, your job is to review and approve the Checks created by the PR author. Once approved, the PR can be merged.
Note
As this tutorial uses DuckDB, which is a file-based database, the reviewer needs to have the same DuckDB file to continue the reviewer's journey.
Run Recce in Review mode
The PR reviewer should prepare their own GitHub token, but must use the same password as the PR author. (The password is used to decrypt the State file and so must be the same.)
- Check out the PR branch.
- Configure the required environment variables.
- Run Recce in --cloud and --review mode.
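The reviewer steps above can be sketched as one hypothetical helper that takes the PR branch name as its argument. It assumes the required environment variables are already set in the current shell and that the remote is named `origin`.

```shell
review_pr() {
  # $1: name of the PR branch to review (supplied by the reviewer).
  if [ -z "${1:-}" ]; then
    echo "usage: review_pr <pr-branch>" >&2
    return 1
  fi
  # Fetch the latest remote branches and switch to the PR branch.
  git fetch origin &&
  git checkout "$1" &&
  # Load the author's synced checks from Recce Cloud in review mode.
  recce server --cloud --review
}
```

For example, `review_pr feature/recce-cloud-demo` would open the checks for a branch of that (hypothetical) name.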
Approve Recce Checks
When Recce loads, click the Checklist tab to review the Checks that have been prepared by the PR author.
Approve all the checks if everything looks good to you.
The approval status of the check is automatically synced to Recce Cloud.
Merge the PR
Back on the GitHub PR page, you'll notice that the Recce Cloud check status has automatically been updated showing that "All checks are approved".
In a real-world situation you'd now be able to merge the PR with the confidence that the PR author had checked their work, and the reviewer both understands and has signed-off on any changes.