[minor]: Introduce IndexSet and IndexMap aliases. #13611

Merged 1 commit on Dec 4, 2024

akurmustafa
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

This PR is similar to PR. It replaces all usages of indexmap::IndexSet and indexmap::IndexMap with datafusion_common::IndexSet and datafusion_common::IndexMap to enforce consistency across DataFusion.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions (bot) added the sql, logical-expr, physical-expr, optimizer, and common labels on Nov 30, 2024
Contributor

@jayzhan211 jayzhan211 left a comment

👍🏻

@findepi
Member

findepi commented Dec 2, 2024

It replaces all usages of indexmap::IndexSet and indexmap::IndexMap with datafusion_common::IndexSet and datafusion_common::IndexMap to enforce consistency across DataFusion.

if some new code uses indexmap::IndexSet directly, will it cause compile-time or CI failure?

@akurmustafa
Contributor Author

It replaces all usages of indexmap::IndexSet and indexmap::IndexMap with datafusion_common::IndexSet and datafusion_common::IndexMap to enforce consistency across DataFusion.

if some new code uses indexmap::IndexSet directly, will it cause compile-time or CI failure?

Actually, neither as far as I know. This is just a name alias.
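The point that an alias alone enforces nothing can be sketched with standard-library types only. Here `common::Map` is a hypothetical stand-in for an alias such as `datafusion_common::IndexMap`, and `count_entries` is an illustrative helper; neither name comes from the PR.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a crate that re-exports a map type under an alias.
mod common {
    pub type Map<K, V> = std::collections::HashMap<K, V>;
}

// The signature names the alias...
fn count_entries(m: &common::Map<String, i32>) -> usize {
    m.len()
}

fn main() {
    // ...but a value built through the original path is the very same type,
    // so the compiler accepts it without any warning or error.
    let mut direct: HashMap<String, i32> = HashMap::new();
    direct.insert("a".to_string(), 1);
    println!("{}", count_entries(&direct));
}
```

Because both paths name one type, new code that imports the original path would still compile; catching it would take a separate lint or CI check.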

Contributor

@alamb alamb left a comment

Seems like an improvement to me, though as @findepi notes it would be even better with more docs / automatic checking / guidance for developers

@findepi
Member

findepi commented Dec 3, 2024

Can Clippy's disallowed types or methods be used to enforce consistency?

Otherwise, I would prefer to ensure consistency by going in the opposite direction.
If datafusion-common doesn't re-export IndexSet, then the code should consistently use indexmap::IndexSet as the only(?) available option.

use datafusion_expr::logical_plan::LogicalPlan;
use datafusion_expr::{Aggregate, Expr, Sort, SortExpr};
use indexmap::IndexSet;
Member

is indexmap also removed from datafusion/optimizer/Cargo.toml?

Member

I think it cannot be removed because of indexmap::Equivalent.

impl Equivalent<(Expr, Expr)> for ExprPair<'_> {
    fn equivalent(&self, other: &(Expr, Expr)) -> bool {
        self.0 == &other.0 && self.1 == &other.1
    }
}
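As a rough illustration of why such an impl is useful, the sketch below defines a local trait that mirrors the shape of `indexmap::Equivalent` and uses it to compare a borrowed pair against owned pairs without cloning. `StrPair` and `contains_pair` are hypothetical names, `String` stands in for `Expr`, and the linear scan is a simplification: the real crate uses the trait for hashed lookups.

```rust
/// Local stand-in mirroring the shape of `indexmap::Equivalent`.
trait Equivalent<K: ?Sized> {
    fn equivalent(&self, key: &K) -> bool;
}

/// Borrowed pair, a hypothetical analogue of `ExprPair<'_>`.
struct StrPair<'a>(&'a String, &'a String);

impl Equivalent<(String, String)> for StrPair<'_> {
    fn equivalent(&self, other: &(String, String)) -> bool {
        self.0 == &other.0 && self.1 == &other.1
    }
}

/// Checks whether any owned pair matches the probe, without cloning the probe.
fn contains_pair<Q>(stored: &[(String, String)], probe: &Q) -> bool
where
    Q: Equivalent<(String, String)>,
{
    stored.iter().any(|owned| probe.equivalent(owned))
}

fn main() {
    let stored = vec![("a".to_string(), "b".to_string())];
    let (x, y) = ("a".to_string(), "b".to_string());
    println!("{}", contains_pair(&stored, &StrPair(&x, &y)));
}
```

Since the optimizer implements the trait itself, it has to name the `indexmap` crate directly unless the trait is re-exported as well.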

@alamb
Contributor

alamb commented Dec 4, 2024

Let's keep improving on main

@alamb alamb merged commit 1aadce0 into apache:main Dec 4, 2024
27 checks passed
@jonahgao
Member

jonahgao commented Dec 4, 2024

Can Clippy's disallowed types or methods be used to enforce consistency?

We can add them to clippy disallowed-types.
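A minimal sketch of what such a rule might look like in a `clippy.toml`, assuming the aliases stay in `datafusion_common`. The `reason` strings are illustrative, and the alias definitions themselves would probably need `#[allow(clippy::disallowed_types)]`.

```toml
# clippy.toml (illustrative entries, not taken from the PR)
disallowed-types = [
    { path = "indexmap::IndexMap", reason = "use datafusion_common::IndexMap instead" },
    { path = "indexmap::IndexSet", reason = "use datafusion_common::IndexSet instead" },
]
```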

But I am a bit doubtful whether this PR is necessary because, unlike HashMap, indexmap does not have two implementations in DataFusion that need to be unified. I think indexmap = { workspace = true } already provides consistency.

In addition, this type alias has added a permanent dependency for datafusion-common and can propagate to other packages, even if one day common/cse.rs no longer needs IndexMap.

We also need to create a type alias for indexmap::Equivalent to ensure consistency.

@@ -93,6 +94,9 @@ pub use error::{
pub type HashMap<K, V, S = DefaultHashBuilder> = hashbrown::HashMap<K, V, S>;
pub type HashSet<T, S = DefaultHashBuilder> = hashbrown::HashSet<T, S>;

pub type IndexMap<T, S = DefaultHashBuilder> = indexmap::IndexMap<T, S>;
Member

I think this should be

pub type IndexMap<K, V, S = RandomState> = indexmap::IndexMap<K, V, S>;
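One reading of this suggestion is that alias parameters bind positionally, so a two-parameter alias over a three-parameter map puts its second parameter in the value slot rather than the hasher slot. The sketch below shows that with `std::collections::HashMap`; `TwoParam`, `ThreeParam`, and `value_type_name` are hypothetical names used only for illustration.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;

// Two-parameter alias: `S` is passed as the *value* type of the map.
type TwoParam<T, S = u8> = HashMap<T, S>;

// Three-parameter alias in the shape of the suggestion: `S` is the hasher.
type ThreeParam<K, V, S = RandomState> = HashMap<K, V, S>;

// Reports the name of the map's value type parameter.
fn value_type_name<K, V, S>(_: &HashMap<K, V, S>) -> &'static str {
    std::any::type_name::<V>()
}

fn main() {
    let two: TwoParam<&str> = HashMap::new();
    let three: ThreeParam<&str, i32> = HashMap::new();
    println!("{}", value_type_name(&two));
    println!("{}", value_type_name(&three));
}
```

With the two-parameter form, the defaulted parameter silently becomes the value type, which is unlikely to be what an `IndexMap` alias intends.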

@findepi
Member

findepi commented Dec 4, 2024

I think indexmap = { workspace = true } already provides consistency.

I agree with that and also suggested it in #13611 (comment)

In addition, this type alias has added a permanent dependency for datafusion-common

good point.
common is pretty common to depend on, but it is also too fat already (e.g. it pulls in the sqlparser dependency).
avoiding dependencies is a good thing.

@akurmustafa
Contributor Author

I agree with @findepi and @jonahgao that this might indeed be unnecessary. At first, this occurred to me as a natural continuation of the previous HashMap and HashSet work. However, it seems that we have a single option anyway; also, without enforcing consistency, this is not helpful. Should I retract these changes?

@jonahgao
Member

jonahgao commented Dec 5, 2024

I agree with @findepi and @jonahgao that this might indeed be unnecessary. At first, this occurred to me as a natural continuation of the previous HashMap and HashSet work. However, it seems that we have a single option anyway; also, without enforcing consistency, this is not helpful. Should I retract these changes?

Thanks @akurmustafa. I think retracting these changes will make things simpler.


6 participants