Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dbt): divest from fqn in a subsetted execution #21857

Merged
merged 1 commit into from
May 23, 2024

Conversation

rexledesma
Copy link
Member

@rexledesma rexledesma commented May 15, 2024

Summary & Motivation

More thoughts on the line of #21851: we don't actually need
to rely on selecting by fqn, which can be very long.

Instead, we can use other markers of uniqueness, if we assume that a project is standalone. This is a reasonable assumption, considering that multi-project setups are only supported by dbt Cloud.


From https://docs.getdbt.com/reference/node-selection/methods:

Selector methods return all resources that share a common property, using the syntax method:value. While it is recommended to explicitly denote the method, you can omit it (the default value will be one of path, file or fqn).

And https://docs.getdbt.com/faqs/models/unique-model-names:

Within one project: yes! To build dependencies between models, you need to use the ref function, and pass in the model name as an argument. dbt uses that model name to uniquely resolve the ref to a specific model. As a result, these model names need to be unique, even if they are in distinct folders.

So, for models and snapshots, we can rely on running tests by their file name.


And from: https://docs.getdbt.com/reference/node-selection/test-selection-examples#more-complex-selection

Through the combination of direct and indirect selection, there are many ways to accomplish the same outcome. Let's say we have a data test named assert_total_payment_amount_is_positive that depends on a model named payments. All of the following would manage to select and execute that test specifically:

dbt test --select "assert_total_payment_amount_is_positive" # directly select the test by name

We can select tests using their test name.


After applying these optimizations when constructing the subset selection, the selection string is reduced in size by about 30-40%.

How I Tested These Changes

pytest

Copy link
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @rexledesma and the rest of your teammates on Graphite Graphite

Copy link
Contributor

@cmpadden cmpadden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lgtm.

To re-iterate my understanding, path doesn't include project_name, which when repeated adds up, and you explained is only necessary in multi-project environments. Any concern with a file moving that would impact path would also impact fqn, so it's a non-issue.

Copy link
Member Author

@rexledesma rexledesma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmpadden Yeah that's right.

cc @johannkm for your eyes as well, since this is of interest to asset checks. It doesn't solve the problem entirely, but this would give us more leeway when hitting the subprocess character limit.

I might gate this on a setting in DagsterDbtTranslator so that it doesn't unintentionally break folks who are using the current subsetted selection string successfully.

@rexledesma rexledesma force-pushed the rl/save-some-more-dbt-characters branch from 85169ec to b73a0f7 Compare May 23, 2024 13:24
@rexledesma rexledesma force-pushed the rl/save-some-more-dbt-characters branch from b73a0f7 to 539805c Compare May 23, 2024 16:57
@rexledesma rexledesma merged commit 5e5f325 into master May 23, 2024
1 check passed
@rexledesma rexledesma deleted the rl/save-some-more-dbt-characters branch May 23, 2024 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants