[fix](nereids)keep at least one hash output slot when prune slots in hash join node #47318

Draft: 1 commit to be merged into base branch master


starocean999
Contributor

Problem Summary:

Consider the SQL below:

SELECT
    9
FROM
    table_20_undef_partitions2_keys3_properties4_distributed_by5 AS tbl_alias2
WHERE
    (
        NOT (
            tbl_alias2.col_int_undef_signed NOT IN (
                SELECT
                    8
                FROM
                    table_50_undef_partitions2_keys3_properties4_distributed_by53
            )
            AND  '2023-12-12' IN ('2023-12-19')
        )
    );

No columns from the hash join node are needed by the rest of the plan, so the hash output slots end up empty after pruning. However, when the hash output slots are empty, BE keeps all columns from both tables. To avoid this, FE now keeps at least one column in the hash output slots when it prunes slots in the hash join node.
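The pruning rule described above can be illustrated with a minimal sketch. This is not the Doris FE code (which is Java); the function name `prune_hash_output_slots`, the use of plain strings for slots, and the choice of keeping the first candidate slot are all assumptions made for illustration. The PR description only says that at least one slot is kept, not which one.

```python
from typing import List, Sequence, Set


def prune_hash_output_slots(candidate_slots: Sequence[str],
                            required_slots: Set[str]) -> List[str]:
    """Hypothetical sketch: prune hash join output slots, never to empty.

    candidate_slots: slots the hash join could output (from both children).
    required_slots:  slots actually needed above the join.
    """
    # Normal pruning: keep only the slots that something above the join needs.
    kept = [slot for slot in candidate_slots if slot in required_slots]

    if not kept and candidate_slots:
        # Nothing is required (e.g. the query only selects a constant).
        # An empty list would be read by the backend as "keep every column",
        # so retain one slot. Picking the first one is an assumption here.
        kept = [candidate_slots[0]]

    return kept


# A query like `SELECT 9 FROM ... WHERE ...` needs no join columns.
slots = ["tbl_alias2.col_int_undef_signed", "subquery.const_8"]
print(prune_hash_output_slots(slots, set()))
```

With an empty `required_slots`, the result contains exactly one slot instead of none, which is the behavior the fix aims for. When some slots are required, the function behaves like ordinary pruning.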
None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 22, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@starocean999
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32129 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0601c94df7d4f329b7bcc472150fad79a3f4180d, data reload: false

------ Round 1 ----------------------------------
q1	17769	5569	5394	5394
q2	2058	328	182	182
q3	10481	1193	726	726
q4	10216	971	539	539
q5	7668	2366	2123	2123
q6	198	168	131	131
q7	893	755	601	601
q8	9243	1360	1152	1152
q9	5146	4914	4974	4914
q10	6844	2352	1862	1862
q11	475	269	266	266
q12	345	349	215	215
q13	17780	3733	3054	3054
q14	230	233	214	214
q15	537	470	459	459
q16	626	606	565	565
q17	552	858	316	316
q18	7140	6557	6437	6437
q19	2059	940	526	526
q20	307	319	191	191
q21	2762	2185	1949	1949
q22	364	331	313	313
Total cold run time: 103693 ms
Total hot run time: 32129 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6034	5535	5502	5502
q2	249	335	229	229
q3	2288	2668	2352	2352
q4	1451	1888	1399	1399
q5	4286	4762	4651	4651
q6	183	165	127	127
q7	2063	1938	1861	1861
q8	2615	2811	2680	2680
q9	7474	7285	7370	7285
q10	3040	3251	2782	2782
q11	562	512	490	490
q12	656	767	630	630
q13	3543	3942	3274	3274
q14	297	305	275	275
q15	536	487	452	452
q16	652	686	650	650
q17	1211	1735	1272	1272
q18	7719	7579	7385	7385
q19	775	1086	1075	1075
q20	2071	2042	1908	1908
q21	5718	5239	5081	5081
q22	598	592	588	588
Total cold run time: 54021 ms
Total hot run time: 51948 ms

@doris-robot

TPC-DS: Total hot run time: 195149 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0601c94df7d4f329b7bcc472150fad79a3f4180d, data reload: false

query1	1320	973	968	968
query2	6308	2153	2130	2130
query3	11049	4512	4540	4512
query4	61055	29166	23081	23081
query5	5515	607	459	459
query6	433	201	186	186
query7	5532	504	302	302
query8	350	253	240	240
query9	8215	2709	2698	2698
query10	458	305	261	261
query11	17384	15078	15512	15078
query12	165	116	109	109
query13	1457	560	423	423
query14	10398	7663	6983	6983
query15	216	200	194	194
query16	7189	647	485	485
query17	1132	743	600	600
query18	1883	415	304	304
query19	200	174	154	154
query20	116	110	112	110
query21	214	126	103	103
query22	4794	4890	4723	4723
query23	33874	33149	33390	33149
query24	5593	2321	2326	2321
query25	459	465	402	402
query26	638	272	149	149
query27	1700	478	333	333
query28	4334	2505	2469	2469
query29	546	564	433	433
query30	205	180	162	162
query31	930	895	845	845
query32	67	63	52	52
query33	424	363	291	291
query34	761	858	520	520
query35	815	875	786	786
query36	1006	1032	951	951
query37	129	96	79	79
query38	4358	4442	4150	4150
query39	1509	1439	1446	1439
query40	221	115	103	103
query41	52	51	48	48
query42	120	101	110	101
query43	527	534	507	507
query44	1428	828	844	828
query45	191	183	169	169
query46	923	1087	674	674
query47	1934	1942	1863	1863
query48	386	417	330	330
query49	718	495	396	396
query50	678	708	395	395
query51	7066	7027	7051	7027
query52	99	106	93	93
query53	223	268	183	183
query54	492	520	427	427
query55	81	83	80	80
query56	266	265	245	245
query57	1208	1224	1203	1203
query58	247	224	228	224
query59	3066	3219	3039	3039
query60	272	273	249	249
query61	164	117	114	114
query62	743	722	652	652
query63	218	188	188	188
query64	1292	1030	658	658
query65	3238	3182	3173	3173
query66	719	398	315	315
query67	15930	15678	15609	15609
query68	3488	854	529	529
query69	480	375	265	265
query70	1214	1154	1148	1148
query71	385	297	260	260
query72	6038	3872	3883	3872
query73	667	847	355	355
query74	10351	9019	8987	8987
query75	3227	3157	2641	2641
query76	3145	1203	807	807
query77	512	359	289	289
query78	10125	10129	9360	9360
query79	2922	797	611	611
query80	1731	525	443	443
query81	570	275	237	237
query82	351	144	119	119
query83	264	177	150	150
query84	289	93	75	75
query85	785	353	309	309
query86	455	301	313	301
query87	4537	4486	4382	4382
query88	3581	2228	2228	2228
query89	402	341	290	290
query90	1613	190	194	190
query91	132	140	113	113
query92	75	56	52	52
query93	2247	895	534	534
query94	751	381	302	302
query95	333	271	251	251
query96	487	608	287	287
query97	2796	2837	2765	2765
query98	220	199	204	199
query99	1292	1363	1271	1271
Total cold run time: 310698 ms
Total hot run time: 195149 ms

@doris-robot

ClickBench: Total hot run time: 31.53 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0601c94df7d4f329b7bcc472150fad79a3f4180d, data reload: false

query1	0.04	0.05	0.03
query2	0.07	0.04	0.04
query3	0.24	0.07	0.06
query4	1.62	0.10	0.10
query5	0.43	0.43	0.42
query6	1.15	0.66	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.57	0.48	0.49
query10	0.56	0.56	0.54
query11	0.14	0.10	0.10
query12	0.13	0.10	0.11
query13	0.61	0.58	0.60
query14	2.82	2.85	2.75
query15	0.91	0.82	0.82
query16	0.38	0.39	0.38
query17	0.97	1.06	1.08
query18	0.23	0.20	0.20
query19	1.83	1.82	2.00
query20	0.02	0.01	0.01
query21	15.35	0.92	0.58
query22	0.75	0.79	0.79
query23	15.18	1.39	0.59
query24	3.40	1.58	1.70
query25	0.21	0.23	0.07
query26	0.19	0.14	0.14
query27	0.04	0.05	0.04
query28	14.67	0.95	0.42
query29	12.56	3.95	3.34
query30	0.25	0.10	0.06
query31	2.82	0.59	0.38
query32	3.23	0.55	0.47
query33	2.98	3.01	3.05
query34	16.64	5.23	4.57
query35	4.51	4.52	4.55
query36	0.64	0.51	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.69 s
Total hot run time: 31.53 s
