Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider improving FrozenSet for byte/char #110783

Open
MihaZupan opened this issue Dec 17, 2024 · 2 comments · May be fixed by #110842
Open

Consider improving FrozenSet for byte/char #110783

MihaZupan opened this issue Dec 17, 2024 · 2 comments · May be fixed by #110842
Assignees
Labels
area-System.Collections in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Milestone

Comments

@MihaZupan
Copy link
Member

MihaZupan commented Dec 17, 2024

FrozenSet<T> is the go-to option if you have an arbitary set of values and just need to do a bunch of Contains(T) queries.
Now that SearchValues<T> is a thing, I've seen the question posed a few times of whether one should use that instead, as it also has a Contains(T).

Currently for the best performance you'd want to use FrozenSet for everything except byte/char, where SearchValues will do a better job.
SearchValues<string>.Contains on the other hand is strictly worse as we're just backing it with a HashSet<string>.

We should consider improving FrozenSet to be as good as SearchValues for these two types to make the decision/recommendations easy.

(part of the issue is also that existing docs on SearchValues don't make it obvious that the point of the type is to use it with extension methods on spans)


In practice the interesting cases here IMO are:

  • using a bitmap when all the values are ASCII
  • consider borrowing the perfect-hash O(1) lookup that doesn't need to walk over entries to confirm matches.
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Dec 17, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

@eiriktsarpalis eiriktsarpalis added tenet-performance Performance related issue and removed untriaged New issue has not been triaged by the area owner labels Dec 17, 2024
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Dec 17, 2024
@eiriktsarpalis
Copy link
Member

@MihaZupan I put this in the future milestone for now. Feel free to promote to 10 if you plan to work on this yourself.

@MihaZupan MihaZupan self-assigned this Dec 19, 2024
@MihaZupan MihaZupan modified the milestones: Future, 10.0.0 Dec 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Collections in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants