[BugFix] Fix partition mv with self joins refresh bug (backport #45876) #45936

Merged
wanpengfei-git merged 4 commits into branch-3.1 from mergify/bp/branch-3.1/pr-45876
May 21, 2024


mergify[bot]
Contributor

mergify bot commented May 20, 2024

Why I'm doing:

  • If an MV's defining query references the same table more than once, the refresh result may be wrong because partition pruning is applied incorrectly. Example:
CREATE TABLE `t1` (
  `k1` int(11) NULL COMMENT "",
  `k2` decimal(38, 8) NULL COMMENT "",
  `k3` decimal(38, 8) NULL COMMENT "",
  `dt`  DATE  NULL COMMENT ""
) 
DUPLICATE KEY(`k1`)
PARTITION BY RANGE (dt) (
    START ("2023-12-31") END ("2025-01-01") EVERY (INTERVAL 1 DAY)
)
DISTRIBUTED BY HASH(`k1`) BUCKETS 3;

CREATE MATERIALIZED VIEW  test_mv1
PARTITION BY dt
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
"replication_num" = "1",
"partition_refresh_number"="-1"
) REFRESH deferred MANUAL as
select k1,k3,dt from t1;

CREATE MATERIALIZED VIEW test_mv2
PARTITION BY dt
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
"replication_num" = "1",
"partition_refresh_number"="-1"
) REFRESH MANUAL as
select t1.k1, t1.k3 t1k3, t4.k3 k4k3, t1.k3-t4.k3 as upyear, t1.dt dt 
from t1 left outer join test_mv1 t4 
on t1.k1=t4.k1 and t4.dt=substr(date_sub(t1.dt,interval dayofyear(t1.dt) day),1,10);

What I'm doing:

  • Push the partition predicate down below the table scan only when no table appears more than once in the MV's defining query.
  • Add the partition predicate to the MV's defining query only if it cannot be pushed below the scan node.
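
The two rules above can be read as a single placement decision. The following is a minimal sketch of that reading, not the code in this PR: the class `PartitionPredicatePlacement`, the enum `Placement`, the methods `hasRepeatedTable` and `choosePlacement`, and the `scanAcceptsPredicate` flag are all hypothetical names introduced for illustration, and tables are identified by plain name strings as a simplifying assumption.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; these names do not come from StarRocks.
final class PartitionPredicatePlacement {

    /** Where a refresh plan would attach the partition predicate. */
    public enum Placement { BELOW_SCAN, ON_DEFINED_QUERY }

    /** Returns true if any table name occurs more than once in the defining query. */
    public static boolean hasRepeatedTable(List<String> scannedTables) {
        Set<String> seen = new HashSet<>();
        for (String table : scannedTables) {
            // Set.add returns false when the element is already present.
            if (!seen.add(table)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Rule 1: push below the scan only when no table is repeated.
     * Rule 2: otherwise, or when the scan cannot take the predicate,
     * attach it to the defining query as a whole.
     * scanAcceptsPredicate is an assumed input, standing in for whatever
     * check decides that a predicate can be pushed below the scan node.
     */
    public static Placement choosePlacement(List<String> scannedTables,
                                            boolean scanAcceptsPredicate) {
        if (!hasRepeatedTable(scannedTables) && scanAcceptsPredicate) {
            return Placement.BELOW_SCAN;
        }
        return Placement.ON_DEFINED_QUERY;
    }

    public static void main(String[] args) {
        // Distinct tables: the predicate may go below the scan.
        System.out.println(choosePlacement(List.of("t1", "test_mv1"), true));
        // Same table twice (a self join): keep the predicate on the query.
        System.out.println(choosePlacement(List.of("t1", "t1"), true));
    }
}
```

Under these assumptions, a query over `t1` and `test_mv1` yields `BELOW_SCAN`, while a self join of `t1` yields `ON_DEFINED_QUERY`, so one scan's partition filter is never applied to a second occurrence of the same table.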

Fixes #issue

What type of PR is this:

  • [x] BugFix
  • [ ] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [x] This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

This is an automatic backport of pull request #45876 done by [Mergify](https://mergify.com).

Signed-off-by: shuming.li <[email protected]>
(cherry picked from commit ec7dd64)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/scheduler/PartitionBasedMvRefreshProcessor.java
#	fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvUtils.java
#	fe/fe-core/src/test/java/com/starrocks/scheduler/PartitionBasedMvRefreshProcessorOlapTest.java
mergify bot added the conflicts label May 20, 2024
Contributor Author

mergify bot commented May 20, 2024

Cherry-pick of ec7dd64 has failed:

On branch mergify/bp/branch-3.1/pr-45876
Your branch is up to date with 'origin/branch-3.1'.

You are currently cherry-picking commit ec7dd64d49.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   fe/fe-core/src/main/java/com/starrocks/scheduler/mv/MVPCTRefreshPlanBuilder.java
	new file:   fe/fe-core/src/test/java/com/starrocks/scheduler/PartitionBasedMvRefreshProcessorOlapPart2Test.java
	new file:   test/sql/test_materialized_view_refresh/R/test_mv_refresh_with_the_same_tables
	new file:   test/sql/test_materialized_view_refresh/T/test_mv_refresh_with_the_same_tables

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)
	both modified:   fe/fe-core/src/main/java/com/starrocks/scheduler/PartitionBasedMvRefreshProcessor.java
	both modified:   fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/materialization/MvUtils.java
	deleted by us:   fe/fe-core/src/test/java/com/starrocks/scheduler/PartitionBasedMvRefreshProcessorOlapTest.java

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Contributor Author

mergify bot commented May 20, 2024

@mergify[bot]: Backport conflict, please resolve the conflict and resubmit the PR

mergify bot closed this May 20, 2024
mergify bot deleted the mergify/bp/branch-3.1/pr-45876 branch May 20, 2024 09:37
LiShuMing restored the mergify/bp/branch-3.1/pr-45876 branch May 20, 2024 11:29
LiShuMing reopened this May 20, 2024
wanpengfei-git enabled auto-merge (squash) May 20, 2024 11:29
Signed-off-by: shuming.li <[email protected]>
Signed-off-by: shuming.li <[email protected]>
Signed-off-by: shuming.li <[email protected]>

sonarcloud bot commented May 21, 2024

wanpengfei-git merged commit 69b3dbb into branch-3.1 May 21, 2024
29 checks passed
wanpengfei-git deleted the mergify/bp/branch-3.1/pr-45876 branch May 21, 2024 02:07