Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

treewide: reduce dependencies on boost ranges and algorithms in public headers #2459

Closed
wants to merge 11 commits into from

Conversation

avikivity
Copy link
Member

The <ranges> library can replace the use of boost algorithms and ranges. Since
most applications will have started using <range> themselves, there's no need
to burden them with the double load of both boost and std. So here we reduce
the use of boost in public headers.

No attempt is made at reducing usage of boost in non-public headers or source,
and boost libraries that have no std replacement are kept.

The return type of smp::all() is changed, but it's unlikely anyone
ever depended on it.
boost::mpl is particularly heavyweight. Replace it with fold expressions
and std::index_sequence.
Replace with <ranges> to reduce dependency load.
@avikivity avikivity force-pushed the boost-reduce-deps branch 3 times, most recently from 526d1c6 to 6c44f34 Compare September 29, 2024 21:09
}
first = false;
_features += f;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why open code instead of use boost one until std is available? Is there a goal to get rid of boost?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to reduce the dependency load on applications that use Seastar. If we keep this one in, a large code base has to #include boost too.

There's no goal goal to remove boost entirely - where there is no std replacement, boost remains. Here the cost/benefit is just bad.

Examples of boost libraries that will remain in use

  • containers (static_vector, small_vector)
  • intrusive
  • program_options

But for ranges, the std replacement is much better, and it's best to use only one.

Copy link
Contributor

@nyh nyh Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the previous patch, that did leave behind boost::algorithm::join(), I think you should either get rid of it in both or leave it in both - but I would add the comment about std::views::join_with in the code, as a comment, not just in the commit message. It may take years until anyone notices again that this change can be done, and a comment will help.

And @gleb-cloudius, in my opinion the answer should definitely be yes. I can only dream of getting rid of Boost completely, but even if that will never be possible we can at least reduce its usage as much as we can. From 10 years of using Boost in our projects, the following are the problems I see with using it. Note that not all problems apply equally to every one of our uses of Boost. Sorry for the long list, I have a grudge against Boost, I guess:

  1. Many Boost header files are very heavy, slowing down our build.
  2. Boost is a grab-bag of a huge number of unrelated features developed by unrelated people. Some are very high quality, some are sadly very low quality. One of the risks of "using Boost in a project" is that developers are tempted to use random features from Boost because "we already use Boost".
  3. Using a large number of different Boost features forces Seastar developers to learn all those features to understand or modify the code. In the long run, It's unavoidable for the developers to have to learn the new official C++ standards - but it's very much avoidable to force them to also learn how Boost variants of the same features work and how they differ from the standard features they might already be familiar with.
  4. It is good form for a library like Seastar to rely on its own types and standard types, not only some third-party library types (e.g., see the smp::all change).
  5. In some cases (I don't know if boost::ranges is a case of this, but I've seen it in the past), the new standard features are more efficient - especially in compile time - because the new standard feature was co-developed with the new language standard which added language features which made super-slow old-style template metaprogramming unnecessary. So often the standard version doesn't just have a standard-looking name - it is objectively better.
  6. It is likely that as features get adopted by the C++ standard, the Boost "versions" of these features will either get officially deprecated, or just "decay" as in 10 years as nobody will want to use them. So we better stop using them before this happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody here is talking about using boot when the same functionality (with the same or better performance characteristics) is available in std. The only reason a lot of code in Scylla still use boost ranges is because clang was slow with implementing std ranges. But all other points are just NIH syndrome. Can be used against any external code.

But with this:

For consistency with the previous patch, that did leave behind boost::algorithm::join(), I think you should either > get rid of it in both or leave it in both - but I would add the comment about std::views::join_with in the code, as a > comment, not just in the commit message. It may take years until anyone notices again that this change can be > done, and a comment will help.

I fully agree and IMO boost::algorithm::join() should remain with a comment that we are waiting for the std counterpart to be available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we lose the benefit of the series if we leave one boost range algorithm in. Those files pull in all of their infrastructure, which then has to be compiled for that one algorithm.

If it were 20, then I wouldn't support open-coding those algorithms. But if it's one or too, it's okay.

btw I can replace the open-coding here with seastar::format and fmt::join, and as fmt is more or less a given, it doesn't add dependency load.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cool boost function will be rejected in favor of a cool std::ranges function.

Copy link
Contributor

@gleb-cloudius gleb-cloudius Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gleb-cloudius sorry if you think that 6 different points I mentioned can be summarized as "NIH syndrome".

Some were about why should we replace boost with std if possible which is a straw man. Nobody argues otherwise. I addressed that.

Most of the points are specifically about why Boost is bad.

And this is NIH talking.

But reading Avi I think what bothered him was mainly my first point - the build speed penalty for a gazillion source files just because somebody decided it will be nice to replace one 3-line loop that anybody who learned programming last year will understand by one cool Boost function.

I do not see where do you get this form. The patch still uses algorithms instead of open coded loops just from a std library. Yes, there is no point pulling a lot of includes for a single not yet available algorithm, but if the code uses 20 of them then it well worth it. std algos, that replace boost once, also pull in headers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gleb-cloudius sorry if you think that 6 different points I mentioned can be summarized as "NIH syndrome".

Some were about why should we replace boost with std if possible which is a straw man. Nobody argues otherwise. I addressed that.

Most of the points are specifically about why Boost is bad.

And this is NIH talking.

It's @nyh, not NIH. NIH would be arguing for us to implement everything outselves and not trust external libraries at all.

But reading Avi I think what bothered him was mainly my first point - the build speed penalty for a gazillion source files just because somebody decided it will be nice to replace one 3-line loop that anybody who learned programming last year will understand by one cool Boost function.

I do not see where do you get this form. The patch still uses algorithms instead of open coded loops just from a std library. Yes, there is no point pulling a lot of includes for a single not yet available algorithm, but if the code uses 20 of them then it well worth it. std algos, that replace boost once, also pull in headers.

Application code will all pull <ranges>, so we're not adding anything. When we include boost libraries, we're doubling the load.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gleb-cloudius sorry if you think that 6 different points I mentioned can be summarized as "NIH syndrome".

Some were about why should we replace boost with std if possible which is a straw man. Nobody argues otherwise. I addressed that.

Most of the points are specifically about why Boost is bad.

And this is NIH talking.

It's @nyh, not NIH. NIH would be arguing for us to implement everything outselves and not trust external libraries at all.

Not everything but anything that is not available in std, but available in boost.

But reading Avi I think what bothered him was mainly my first point - the build speed penalty for a gazillion source files just because somebody decided it will be nice to replace one 3-line loop that anybody who learned programming last year will understand by one cool Boost function.

I do not see where do you get this form. The patch still uses algorithms instead of open coded loops just from a std library. Yes, there is no point pulling a lot of includes for a single not yet available algorithm, but if the code uses 20 of them then it well worth it. std algos, that replace boost once, also pull in headers.

Application code will all pull , so we're not adding anything. When we include boost libraries, we're doubling the load.

If application pulls then we're not adding anything. If application pulls boost::algorithms we do not pull anything new as well. And until were available I would think that any application that wanted the functionality would have used it from boost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cool boost function will be rejected in favor of a cool std::ranges function.

Yes, except one place where you replaced the cool boost function by a loop, and this is what bothered Gleb (and didn't bother me one iota, forgive me for the pun).

Reduce the depdendency load by moving the dependency from a public header
to private implementation. Those functions are heavyweight, and not
performance critical (used during negotation), so there is no performance
impact.
Replace with <any>. Added to global module fragment to make the modules
build pass.
Technically speaking the type was public, but realistically everyone
should have used the provided accessors.
@avikivity
Copy link
Member Author

v2: replaced two patches touching rpc multi compressor factory with a single patch deinlining some functions, avoiding controversial open-coding of boost::join.

@nyh
Copy link
Contributor

nyh commented Sep 30, 2024

Thanks, looks good to me (the solution of moving code into ".cc" is even better). @gleb-cloudius please confirm you don't have any more objections.

@nyh nyh closed this in f322e76 Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants