Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@expect() hint to optimizer #489

Open
PavelVozenilek opened this issue Sep 17, 2017 · 25 comments · May be fixed by #19658
Open

@expect() hint to optimizer #489

PavelVozenilek opened this issue Sep 17, 2017 · 25 comments · May be fixed by #19658
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@PavelVozenilek
Copy link

PavelVozenilek commented Sep 17, 2017

Accepted Proposal


This is proposal for a small feature, local and isolated. Could improve code readability.


Linux kernel uses macros likely and unlikely :

if (likely(x > 0)) { ... }
...
if (unlikely(y == NULL)) { ... }

These macros may improve performance a little bit and they serve as handy documentation.

Language Nim supports this functionality too, through its standard library ( https://nim-lang.org/docs/system.html#likely.t,bool ).


My proposal:

Add if+ and if- into the language. if+ would signal that the path is rather likely, if- will suggest the flow will not go this way.

if+ (x < 0) { 
  ...  // likely goes here
}
if- (err) { 
  ... // unlikely
}

What is this good for:

  1. It gives hint to source code reader, hint which will be almost never found in the documentation. This hint is very short, intuitive, and doesn't require extra pair of parenthesis.
  2. It may slightly improve the performance, by rearranging instructions so that the likely path goes w/o jump, or by inserting hint for branch selector.

How it could be implemented:

  1. By doing nothing, just using it as documentation only feature.
  2. By using IR opcode llvm.expect. This is how clang implements __builtin_expect, which is called by likely/unlikely macros. (GCC also provides __builtin_expect, MSVC has nothing similar.)

Performance effects of __builtin_expect are hotly disputed on the internet. Many people claim programmers are invariably bad in such prediction and profile guided optimization will do much, much better. However, they never support their claim by benchmarks.

Even if performance is unaffected the documentation value remains. When one is stepping the code through debugger and the flow goes against the hint, one may be get more cautious a catch a bug.

@raulgrell
Copy link
Contributor

raulgrell commented Sep 17, 2017

I do think that something to guide branch prediction can be pretty neat, but might be clearer with a builtin like @expect(expression, value) or @likely(condition).

This would be less surprising to someone who's familiar with llvm.expect and gcc __builtin_expect. A builtin might also be better suited as it communicates a hint to the compiler as opposed to actual program logic.

if (@expect(x < 0, false)) {
    //
}

@PavelVozenilek
Copy link
Author

I use likely/unlikely in my code and the additional parenthesis make have impact on readability. Alternative syntax could be something as


if (x > 0) {
  @likely
   ...
} else {
  ...
}

but this wastes one precious line.

@andrewrk andrewrk added this to the 0.2.0 milestone Sep 17, 2017
@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Sep 17, 2017
@raulgrell
Copy link
Contributor

raulgrell commented Sep 17, 2017

Sure, it wastes a line, but it makes it abundantly clear that the first branch is the likely one. It is important to note that if you choose the wrong branch, performance will be worse and it is actually good for the hint to be easily found. More information on performance: http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html

More explicitly/consistent with the language, perhaps:

if (x > 0) {
  @expectedBranch(this);
   ...
} else {
  ...
}

@andrewrk
Copy link
Member

It's going to be @expect as described by @raulgrell. This avoids adding new syntax and closely matches the LLVM intrinsic.

@tiehuis tiehuis added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 18, 2017
@thejoshwolfe thejoshwolfe changed the title if+/if- if+/if- or @expect() Sep 19, 2017
@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk andrewrk changed the title if+/if- or @expect() @expect() hint to optimzer Dec 3, 2017
@andrewrk andrewrk added the accepted This proposal is planned. label Dec 3, 2017
@lanior
Copy link

lanior commented Feb 24, 2018

Compare

if (@expect(x < 0, false))

with

#define likely(x)   (__builtin_expect(!!(x), 1))
#define unlikely(x) (__builtin_expect(!!(x), 0))

if unlikely(x < 0)
   ...

The former is too chatty. If you are going to leave two sets of parenthesis, please consider adding @likely and @unlikely because people are used to it. Almost nobody uses two argument intrinsic directly.

@thejoshwolfe
Copy link
Sponsor Contributor

How would I expect no error from something?

if (errorable()) |payload| {
    // this should be likely
} else |err| {
    // this should be unlikely
}

The proposed @expect(expression, value) builtin looks like it needs to be able to effectively do == comparison, which doesn't very well cover the spectrum of possible control flow in zig.

Would there be any way to expect or not expect certain branches of a switch? How about the else of a while? How about the error path in a try, catch, or errdefer?

Glancing at the LLVM docs, it looks like we're pretty limited in what we can do. Looks like @expect() is the best we can do for now. I think a more generally useful feature is the ability to annotate an IR basic block to be likely or unlikely, then have zig builtins that you state at the beginning of a block, like @expectedBranch(this); that @raulgrell proposed.

The former is too chatty.

I think it's ok to be a little verbose with this feature, since it's a bit advanced and not recommended unless you understand the drawbacks. My concern is lack of generality.

@andrewrk
Copy link
Member

Zig would always expect no error. That's #84

@0joshuaolson1
Copy link

Would this make sense on other conditional Zig operators?

@shawnl
Copy link
Contributor

shawnl commented Aug 7, 2018

Would this make sense on other conditional Zig operators?

Yes, and for that reason likely/unlikely is prob. Better.

@PavelVozenilek
Copy link
Author

PavelVozenilek commented Aug 7, 2018

@0joshuaolson1: it probably doesn't make sense to expand the feature. It has to be used often to have an impact. if+/if- has low typing overhead and is also acts as intuitive self-documentation (the main advantage, I would say). Trying to shoehorn it elsewhere would require clumsy syntax.

Language Nim has support for fine tuned switch, using linearScanEnd pragma ( https://nim-lang.org/docs/manual.html#pragmas-linearscanend-pragma ), but I think this is overkill

@BarabasGitHub
Copy link
Contributor

Why does it have to be used often to have an impact? I'd say you'll only have to use it in hot loops and stuff like that. It really doesn't matter most of the time.

@PavelVozenilek
Copy link
Author

@BarabasGitHub: the hint allows to reorder instructions so that the likely flow goes without jumping (and this cleaning the pipeline). This can save few cycles. To have measurable impact, it should be applied a lot.

I value it more as the documentation. VC++ does not support implementation of likely/unlikely like GCC does, but I use it anyway, as empty macro.

@BarabasGitHub
Copy link
Contributor

BarabasGitHub commented Aug 8, 2018 via email

@PavelVozenilek
Copy link
Author

PavelVozenilek commented Aug 8, 2018

In theory, with a very advanced benchmarking tool (could be based on #1010#issuecomment-389227431), validity of these hints could be checked and clear violation reported. This may help the programmer to discover wrong assumptions about runtime behaviour.

If the major reason for if+/if- is self-documentation, then it makes sense to use it often.

@andrewrk
Copy link
Member

andrewrk commented Aug 8, 2018

In theory, with a very advanced benchmarking tool validity of these hints could be checked and clear violation reported.

That sounds like #237

@andrewrk andrewrk removed the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Nov 21, 2018
@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Nov 21, 2018
@andrewrk
Copy link
Member

andrewrk commented Jul 5, 2019

Related question, should it be called @asExpected instead?

For a long time I've hated the convention of defining likely() and unlikely() macros for __builtin_expect, because they read so wrong in English ("if this condition is likely...")
So a while back I was thinking about how to replace them in a way that reads naturally: as_expected() and unexpectedly(). "If, as expected, ..." and "If, unexpectedly, ..."

https://twitter.com/RichFelker/status/1146906341417594887

@andrewrk
Copy link
Member

Counter proposal: #5177

@daurnimator
Copy link
Collaborator

daurnimator commented Jul 16, 2020

The builtin should probably more closely resemble the LLVM ‘llvm.expect.with.probability’ Intrinsic in that it also takes a probability of how likely the value is.

@andrewrk
Copy link
Member

Both @cold (#5177) and this are accepted.

@expect(value, comptime expected_value: @TypeOf(value), comptime probability: f32) @TypeOf(value)

Returns value. value may be any type. probability is a value between 0 and 1.0, inclusive.

how to expect a certain switch prong
how to expect a certain branch for if-optionals
how to override the default and expect a certain branch for if-error-unions
how to expect a certain branch for while-bool, while-optional, while-error-union

These are all solved with #5177 which is also accepted.

@andrewrk andrewrk added accepted This proposal is planned. and removed optimization labels Apr 22, 2021
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Apr 22, 2021
@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 May 19, 2021
@johan-bolmsjo
Copy link
Sponsor Contributor

I want to add that apart from the mentioned benefits, readability and static hints for branch prediction it also reduce the pressure on the L1 instruction cache by typically moving cold instructions last in functions. I believe this to be the greatest benefit of using a likely/unlikely construct. For the longest time I was not a believer but I've seen the performance benefits of using this in a real product that was optimized a lot over its life time. Eventually we resorted to pepper the hot code paths with likely and unlikely. Ugly, but it can be effective.

@andrewrk andrewrk changed the title @expect() hint to optimzer @expect() hint to optimizer May 19, 2023
@Snektron
Copy link
Collaborator

Snektron commented Oct 15, 2023

how to expect a certain switch prong

switch (@expect(123, x, 1)) {
  123 => { 
    // likely
  },
  124, else => {
    // unlikely
  },
}

Here its still impossible to order branches in likelyness. This could be handled with an if/else chain I guess.

@mlugg
Copy link
Member

mlugg commented Oct 15, 2023

Here it's still impossible to order branches in likeliness.

@expect(123, @expect(124, x, 0.3), 0.7) perhaps? That said, the arg order makes this a bit weird. Perhaps it'd be better if it were @expect(123, 0.7, @expect(124, 0.3, x)). Or maybe even if @expect took a whole map of probabilities in some form, but maybe that's more complicated of a form than we want in a builtin.

Also, regarding the natural language point made earlier in this isue: perhaps @withExpectation would be a better name? "expect" is definitely better than "likely", but still reads a little weird.

@Rexicon226 Rexicon226 linked a pull request Apr 15, 2024 that will close this issue
@jacobly0
Copy link
Member

jacobly0 commented May 7, 2024

I realize that this accepted definition of @expect matches an llvm intrinsic, but I do not believe that this is the most helpful tool for the programmer, for self-hosted backends, or for automated addition of profiling annotations to the source code.

As a programmer, I care less about the expected value and more about the expected branch, I want go to the part of the source code that contains the code path that I think is important and annotate that. Also, raw probabilities are problematic for keeping them consistent while editing code, what I really want are weights for each branch with a default of 1 for unannotated branches. Note that this doesn't prevent using weights that add up to some number like 100 and that translate directly to probabilities.

For self-hosted backends, assuming no higher level optimizations, the best you can do for an if branch is make the machine code likely branch (with the x86_64 branch predictor, not taken for conditional forward branches and taken for conditional backwards branches) match a source code annotation. This is certainly possible with the accepted @expect definition, but requires either Sema to shuffle around the information to be more useful (basically an optimization pass at that point) or the backend to go through unnecessarily convoluted value tracking.

One optimization that can be done for switches on x86_64 is make the most likely prong a fake "fallthrough" from the indirect branch (which the x86_64 branch predictor assumes as the likely target). This is also compatible with the accepted definition of @expect. Additionally, we should not be take llvm's codegen as gospel and the best we can hope to accomplish. For example, we already expect to have to generate custom jump tables to codegen labeled continue. So there's no reason to expect that a self-hosted backend wouldn't use the same or similar logic. However, for switches that are not amenable to jump tables due to non-contiguity of prong items or due to overly large prong item ranges, we are in the realm of deciding how to order the value comparisons, for which we need more information for each prong to be able to make informed decisions. This requires something closer to what was more helpful for the programmer above.

For automated tooling, a profiler is going to have branch counts for each branch, and so it would be trivial to just edit the source code with an annotation for the branch count of each branch, which is just a weight as described above.

For a syntax proposal, in the interest of avoiding extra grammar complexity, I think there can just be a @branchWeight builtin or similar that applies a weight to the enclosing scope, just like many other builtins (@setEvalBranchQuota, @setFloatMode, @setRuntimeSafety). This would look like:

switch (x) {
    1 => { // likely
        @branchWeight(100);
    },
    2 => {}, // unlikely, default weight of 1
    // it seems like this branch should have a lower weight than the default
    // maybe a weight of 0, or floats weights < 1 could mean "cold" or "never optimize for this branch being taken"
    // in the future, branches that always trigger safety, panic, or error could be detected and treated like that
    else => unreachable,
}
while (true) {
    if (normal_term_cond) {
        @branchWeight(10); // slightly unlikely termination condition
        break;
    } else {
        @branchWeight(1_000); // very likely to keep looping
    }
    if (special_case) {
        @branchWeight(1); // very unlikely termination condition
        break;
    } else {
        @branchWeight(1_000); // very likely to keep looping
    }
}

@silversquirl
Copy link
Contributor

Adding to jacobly's points, I also wonder whether @setCold could be merged with this somehow, as they fill similar purposes (at least from the perspective of the programmer). Perhaps a negative or zero @branchWeight at the top level of a function could replace it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

Successfully merging a pull request may close this issue.