treewide: reduce dependencies on boost ranges and algorithms in public headers #2459

avikivity · 2024-09-29T18:33:12Z

The <ranges> library can replace the use of boost algorithms and ranges. Since
most applications will have started using <ranges> themselves, there's no need
to burden them with the double load of both boost and std. So here we reduce
the use of boost in public headers.

No attempt is made at reducing usage of boost in non-public headers or source,
and boost libraries that have no std replacement are kept.

gleb-cloudius · 2024-09-30T08:13:38Z

include/seastar/rpc/multi_algo_compressor_factory.hh

+            }
+            first = false;
+            _features += f;
+        }


So why open-code this instead of using the boost one until the std one is available? Is there a goal to get rid of boost?

The goal is to reduce the dependency load on applications that use Seastar. If we keep this one in, a large code base has to #include boost too.

There's no goal to remove boost entirely - where there is no std replacement, boost remains. Here the cost/benefit is just bad.

Examples of boost libraries that will remain in use

containers (static_vector, small_vector)

intrusive

program_options

But for ranges, the std replacement is much better, and it's best to use only one.
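For reference, the kind of open-coded join being debated (a loop with a `first` flag, as in the diff fragment above) can be sketched roughly as follows. This is an illustration only, not the code from the patch: `join_names` and its parameters are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins names with a separator using a plain loop,
// so the including header needs neither boost::algorithm::join nor
// C++23's std::views::join_with.
std::string join_names(const std::vector<std::string>& names, const std::string& sep) {
    std::string out;
    bool first = true;
    for (const auto& n : names) {
        if (!first) {
            out += sep;   // separator goes before every element except the first
        }
        first = false;
        out += n;
    }
    return out;
}
```

The trade-off is a few more lines in the header in exchange for not including the boost range and algorithm infrastructure.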

For consistency with the previous patch, that did leave behind boost::algorithm::join(), I think you should either get rid of it in both or leave it in both - but I would add the comment about std::views::join_with in the code, as a comment, not just in the commit message. It may take years until anyone notices again that this change can be done, and a comment will help.

And @gleb-cloudius, in my opinion the answer should definitely be yes. I can only dream of getting rid of Boost completely, but even if that will never be possible we can at least reduce its usage as much as we can. From 10 years of using Boost in our projects, the following are the problems I see with using it. Note that not all problems apply equally to every one of our uses of Boost. Sorry for the long list, I have a grudge against Boost, I guess:

Many Boost header files are very heavy, slowing down our build.

Boost is a grab-bag of a huge number of unrelated features developed by unrelated people. Some are very high quality, some are sadly very low quality. One of the risks of "using Boost in a project" is that developers are tempted to use random features from Boost because "we already use Boost".

Using a large number of different Boost features forces Seastar developers to learn all those features to understand or modify the code. In the long run, it's unavoidable for the developers to have to learn the new official C++ standards - but it's very much avoidable to force them to also learn how Boost variants of the same features work and how they differ from the standard features they might already be familiar with.

It is good form for a library like Seastar to rely on its own types and standard types, not only some third-party library types (e.g., see the smp::all change).

In some cases (I don't know if boost::ranges is a case of this, but I've seen it in the past), the new standard features are more efficient - especially in compile time - because the new standard feature was co-developed with the new language standard which added language features which made super-slow old-style template metaprogramming unnecessary. So often the standard version doesn't just have a standard-looking name - it is objectively better.

It is likely that as features get adopted by the C++ standard, the Boost "versions" of these features will either get officially deprecated, or just "decay" over the next 10 years as nobody will want to use them. So we had better stop using them before this happens.

Nobody here is talking about using boost when the same functionality (with the same or better performance characteristics) is available in std. The only reason a lot of code in Scylla still uses boost ranges is because clang was slow with implementing std ranges. But all other points are just NIH syndrome. They can be used against any external code.

But with this:

> For consistency with the previous patch, that did leave behind boost::algorithm::join(), I think you should either get rid of it in both or leave it in both - but I would add the comment about std::views::join_with in the code, as a comment, not just in the commit message. It may take years until anyone notices again that this change can be done, and a comment will help.

I fully agree and IMO boost::algorithm::join() should remain with a comment that we are waiting for the std counterpart to be available.

The problem is that we lose the benefit of the series if we leave one boost range algorithm in. Those files pull in all of their infrastructure, which then has to be compiled for that one algorithm.

If it were 20, then I wouldn't support open-coding those algorithms. But if it's one or two, it's okay.

btw I can replace the open-coding here with seastar::format and fmt::join, and as fmt is more or less a given, it doesn't add dependency load.

The cool boost function will be rejected in favor of a cool std::ranges function.

@gleb-cloudius sorry if you think that 6 different points I mentioned can be summarized as "NIH syndrome".

Some were about why should we replace boost with std if possible which is a straw man. Nobody argues otherwise. I addressed that.

Most of the points are specifically about why Boost is bad.

And this is NIH talking.

But reading Avi I think what bothered him was mainly my first point - the build speed penalty for a gazillion source files just because somebody decided it will be nice to replace one 3-line loop that anybody who learned programming last year will understand by one cool Boost function.

I do not see where you get this from. The patch still uses algorithms instead of open-coded loops, just from the std library. Yes, there is no point pulling in a lot of includes for a single not-yet-available algorithm, but if the code uses 20 of them then it is well worth it. std algos, which replace the boost ones, also pull in headers.

It's @nyh, not NIH. NIH would be arguing for us to implement everything ourselves and not trust external libraries at all.

Application code will all pull <ranges>, so we're not adding anything. When we include boost libraries, we're doubling the load.

Not everything but anything that is not available in std, but available in boost.

If the application pulls <ranges>, then we're not adding anything. If the application pulls boost::algorithms, we do not pull anything new either. And until <ranges> was available, I would think that any application that wanted the functionality would have used it from boost.

The cool boost function will be rejected in favor of a cool std::ranges function.

Yes, except one place where you replaced the cool boost function by a loop, and this is what bothered Gleb (and didn't bother me one iota, forgive me for the pun).

avikivity · 2024-09-30T09:41:10Z

v2: replaced two patches touching rpc multi compressor factory with a single patch deinlining some functions, avoiding controversial open-coding of boost::join.

nyh · 2024-09-30T10:04:05Z

Thanks, looks good to me (the solution of moving code into ".cc" is even better). @gleb-cloudius please confirm you don't have any more objections.

avikivity added 4 commits September 29, 2024 19:41

execution_stage: remove unnecessary boost includes

efed06b

Reduces dependency load.

reactor: remove unnecessary boost includes

785a1a2

Reduces dependency load.

smp: drop dependency on boost ranges

f1145c4

The return type of smp::all() is changed, but it's unlikely anyone ever depended on it.

resource: drop unused dependency on boost::any

f9a6495

avikivity requested a review from xemul September 29, 2024 18:33

avikivity force-pushed the boost-reduce-deps branch from a3d7c2f to 5134054 Compare September 29, 2024 19:40

avikivity added 3 commits September 29, 2024 22:46

prefetch: drop dependency on boost::mpl

52b0f3a

boost::mpl is particularly heavyweight. Replace it with fold expressions and std::index_sequence.
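A minimal sketch of the general technique, compile-time iteration over indices with a fold expression and `std::index_sequence`, is shown below. It is not the code from the patch; `for_each_index` and `sum_first` are hypothetical names, and summing array elements merely stands in for issuing one prefetch per element.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Calls f(I) for each I in 0..N-1, expanded at compile time with a
// comma fold over the index pack.
template <typename Func, std::size_t... Is>
void for_each_index_impl(Func&& f, std::index_sequence<Is...>) {
    (f(Is), ...);
}

template <std::size_t N, typename Func>
void for_each_index(Func&& f) {
    for_each_index_impl(f, std::make_index_sequence<N>{});
}

// Example use: visit the first N elements of an array.
template <std::size_t N, typename T, std::size_t M>
T sum_first(const std::array<T, M>& a) {
    static_assert(N <= M, "cannot visit more elements than the array holds");
    T total{};
    for_each_index<N>([&](std::size_t i) { total += a[i]; });
    return total;
}
```

Only `<utility>` is needed for the iteration itself, which is the point of dropping boost::mpl here.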

scheduling_specific: drop dependency on boost range adaptors

9d23713

Replace with <ranges> to reduce dependency load.

sharded: replace boost ranges with <ranges>

9577f26

Reduces dependency load.

avikivity force-pushed the boost-reduce-deps branch 3 times, most recently from 526d1c6 to 6c44f34 Compare September 29, 2024 21:09

gleb-cloudius reviewed Sep 30, 2024

avikivity added 4 commits September 30, 2024 12:37

rpc: compressor factory: deinline some boost-using functions

52988ce

Reduce the dependency load by moving the dependency from a public header to private implementation. Those functions are heavyweight, and not performance critical (used during negotiation), so there is no performance impact.
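The deinlining idea can be sketched as below: the public header keeps only a declaration, and the definition, together with the heavy include it needs, moves to a source file. Both "files" are shown in one listing for brevity. The file names and `supported_features` are hypothetical, and `<numeric>` merely stands in for the heavyweight boost header.

```cpp
// ---- compressor_names.hh (hypothetical public header) ----
#include <string>
#include <vector>

// Declaration only: including this header pulls in no algorithm headers.
std::string supported_features(const std::vector<std::string>& names);

// ---- compressor_names.cc (hypothetical private source) ----
#include <iterator>
#include <numeric>   // stands in for the heavy dependency
#include <utility>

// Joins the names with commas. Out of line, so the dependency is compiled
// once here instead of in every translation unit that includes the header.
std::string supported_features(const std::vector<std::string>& names) {
    if (names.empty()) {
        return {};
    }
    return std::accumulate(std::next(names.begin()), names.end(), names.front(),
        [](std::string acc, const std::string& n) {
            return std::move(acc) + "," + n;
        });
}
```

The cost is one non-inlined call, which the commit message argues is irrelevant because the functions only run during negotiation.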

rpc: drop unnecessaty includes to boost libraries

f8b4a51

Reduces dependency load.

tls: drop dependency on boost::any

a3d7893

Replace with <any>. Added to global module fragment to make the modules build pass.

rpc: rpc_types: replace boost::any with std::any

867a242

Technically speaking the type was public, but realistically everyone should have used the provided accessors.
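A small sketch of why accessor-based callers are insulated from the boost::any to std::any switch: if the stored type is only reached through typed accessors, the underlying holder can change freely. `opaque_slot` and its members are hypothetical and do not reflect the actual rpc types.

```cpp
#include <any>
#include <string>
#include <utility>

// Hypothetical holder for opaque per-connection data.
class opaque_slot {
    std::any _value;   // was boost::any in this hypothetical "before" state
public:
    template <typename T>
    void set(T v) { _value = std::move(v); }

    // Returns a pointer to the stored T, or nullptr if the slot is empty
    // or holds a different type.
    template <typename T>
    const T* get() const { return std::any_cast<T>(&_value); }
};
```

Code that calls `set` and `get` compiles unchanged after the switch; only code that named the holder type directly would notice.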

avikivity force-pushed the boost-reduce-deps branch from 6c44f34 to 867a242 Compare September 30, 2024 09:40

nyh approved these changes Sep 30, 2024

gleb-cloudius approved these changes Sep 30, 2024

nyh closed this in f322e76 Sep 30, 2024
