-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix join with sort push down #13560
base: main
Are you sure you want to change the base?
Fix join with sort push down #13560
Conversation
Thank you @haohuaijin This appears to be code that was introduced in #12559 from @berkaysynnada @berkaysynnada is there any way you can help review this PR? |
Hi @haohuaijin, and sorry for the delayed response. I have been very busy over the past few days. I have reviewed your fix and have some comments about the problem and the solution. The issue originates from Your fix refers to column names when propagating the order. However, if there are multiple columns in the child schema, this fix may still be insufficient. To address this, I suggest two potential solutions:
By the way, @alamb, isn’t it risky to have an |
Yes, I agree there is this danger I think the core challenge is that arrow Schema can have duplicated field names, but there are various places in DataFusion that assume the field name is unique (to resolve column referenes) Even more confusing is that the unique column name assumption is only in the @findepi has been observing similar issues when thinking about supported repeated column names in the logical plan |
So how shall we fix this bug? Is this PR ok to go (even if not perfect)? Or should we do something else? |
I can open a PR implementing my first suggestion as it does not require much effort. However, it will handle only hash join properly, and other operators will still go on the custom handler function, and it remains same, not being such custom actually. I'd like to hear from @haohuaijin before deciding. |
Sorry for the delay @berkaysynnada and @alamb, and thanks for the suggestion for @berkaysynnada
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks you @haohuaijin, LGTM. I have just a few minor suggestions.
The plans in the tests look improved (BTW did the previous way of the fix change the plans like that? 🤔 )
Thanks for your reviews and suggestions @berkaysynnada
the previous fix didn't change the plan. Maybe because the current solutions take into account the join type(for right join, the left side don't output) and projection, so make the plan change |
Which issue does this PR close?
Closes #13559
Rationale for this change
What changes are included in this PR?
Are these changes tested?
test by existed test and add a sqllogicaltest test
Are there any user-facing changes?