refresh with cascade: true - refreshing unrelated materialized views #275
Comments
Any update on this issue?
Well tested patches are welcome
François says: Apparently, Scenic's algorithm for retrieving dependent materialized views cannot be trusted: #275
* Start adding new mat views
* Add BPs per month
* Refactor and add specs for v2
* freeze
* linting
* Turn cascade off, it refreshes too many tables
  See scenic-views/scenic#275
* Use RefreshMaterializedViews to refresh the new matviews
* Visits depend up on States, so refresh states first
* States depend up on Visits, so refresh Visits first
* Enforce a known good time for all these specs
* Enforce end of June for UTC, ET, and IST
* Freeze time to avoid intermittent failures
There's an argument this is correct, is there not? If we're refreshing I can see an argument that we should only refresh down the tree so to speak, but I think that would be more surprising to users who know they did something to change
This is definitely a bug.
I respectfully disagree. "cascade" usually means uni-directional, meaning down the tree (or up the tree if you prefer to look at it that way). I would consider another flag for the behavior you are describing. Perhaps there should be three: "upstream" (the current cascade), "downstream" and "both" where "both" is the behavior you are describing.
Here is code to refresh without affecting unnecessary mat views. It also includes
Hi,
There is a flaw in how the Postgres adapter's DependencyParser class determines which mat views are dependencies of the mat view being refreshed. As a result, unrelated mat views are refreshed as well.
The reason is that tsort(dependency_hash) sorts every mat view in the db into a single linear array, including ones that have no relation to the mat view being refreshed.
Take this example:
A depends on B which depends on C
E depends on D which depends on C
No matter what you do, A will either be before E in the array or after it:
example #1 [C,B,A,D,E]
example #2 [C,D,E,B,A]
If I refresh (cascade: true) mat view E it will also unnecessarily refresh B then A in example #1.
If I refresh (cascade: true) mat view A it will also unnecessarily refresh D then E in example #2.
TL;DR - Forcing all mat views in the db into a single linear dependency array will cause unnecessary (and costly) refreshes of unrelated mat views.
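The first example can be reproduced with Ruby's standard TSort module. This is only a sketch of the behaviour described above, not Scenic's actual code: the `views` hash and the "take everything sorted before the target" slice are assumptions made for illustration.

```ruby
require "tsort"

# Hypothetical dependency map: each mat view maps to the views it reads from.
# A depends on B which depends on C; E depends on D which depends on C.
views = {
  "A" => ["B"],
  "B" => ["C"],
  "C" => [],
  "D" => ["C"],
  "E" => ["D"]
}

each_node  = lambda { |&b| views.each_key(&b) }
each_child = lambda { |name, &b| views.fetch(name).each(&b) }

# One linear ordering of every mat view in the db, dependencies first.
sorted = TSort.tsort(each_node, each_child)

# Assumed cascade rule: refresh everything sorted before the target.
target    = "E"
refreshed = sorted[0...sorted.index(target)]

puts "sorted:    #{sorted.inspect}"
puts "refreshed: #{refreshed.inspect}"
```

Only C and D are real dependencies of E here, yet A and B end up in `refreshed` because they happen to sort before E.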
EDIT: This problem is not only about mat views having a common 'ancestry'. It also applies to situations with no common 'ancestry':
A depends on B which depends on C
F depends on E which depends on D
tsort(dependency_hash) might put all mat views in a single array that will look like the following:
[C,B,A,D,E,F]
Refreshing F with cascade: true will result in all mat views being refreshed, including the unrelated A, B and C.
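One possible way to avoid this is to walk the dependency graph from the target only, instead of slicing a global ordering. The sketch below is an assumption about how that could look, not a patch against Scenic: `upstream_of` and the `graph` hash are hypothetical names, and the graph is assumed to be acyclic.

```ruby
# Hypothetical dependency map: each mat view maps to the views it reads from.
# A depends on B which depends on C; F depends on E which depends on D.
graph = {
  "A" => ["B"], "B" => ["C"], "C" => [],
  "F" => ["E"], "E" => ["D"], "D" => []
}

# Returns the transitive dependencies of `view`, deepest first, so each
# view appears after everything it reads from. `view` itself is excluded.
# Views not reachable from `view` are never visited.
def upstream_of(view, graph, seen = {})
  graph.fetch(view, []).each_with_object([]) do |dep, order|
    next if seen[dep]

    seen[dep] = true
    order.concat(upstream_of(dep, graph, seen))
    order << dep
  end
end

puts "refresh before F: #{upstream_of('F', graph).inspect}"
puts "refresh before A: #{upstream_of('A', graph).inspect}"
```

With this approach refreshing F touches only D and E, and refreshing A touches only C and B.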