CUDA - Inline literal in (...) statements #9528
CPU seems to be doing better than GPU here, though I'm not as experienced in reading these outputs.
c200chromebook changed the title from "Inline literal in (...) statements" to "CUDA - Inline literal in (...) statements" on Apr 12, 2024
@gmarkall This one is probably yours.
sklam added the labels "CUDA" (CUDA related issue/PR) and "performance - run time" (Performance issue occurring at run time) on Apr 15, 2024
Feature request
We are seeing (tiny) differences in performance between `x in (1,2)` and `x == 1 or x == 2`, favoring the latter. A bit of asm inspection shows that a subfunction is being created and not inlined in the first case on the CUDA target. Is it possible to automatically inline this or add support for this case to `literal_unroll`? It makes the code a bit prettier and more pythonic if you can use the `in` statement.
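A minimal sketch for reproducing the comparison, assuming Numba with a working CUDA device. The names `member_in`, `member_eq`, and `compare_on_gpu` are made up for illustration and are not from the original report. The sketch only checks that the two spellings agree; it does not measure the performance gap or show the generated assembly.

```python
def member_in(x):
    # Membership spelling from the issue.
    return x in (1, 2)


def member_eq(x):
    # Hand-expanded spelling from the issue.
    return x == 1 or x == 2


def compare_on_gpu(values):
    """Run both spellings on the CUDA target; return None if that is not possible."""
    try:
        import numpy as np
        from numba import cuda
    except ImportError:
        return None
    if not cuda.is_available():
        return None

    # Compile the same Python functions as CUDA device functions.
    dev_in = cuda.jit(device=True)(member_in)
    dev_eq = cuda.jit(device=True)(member_eq)

    @cuda.jit
    def kernel(arr, out_in, out_eq):
        i = cuda.grid(1)
        if i < arr.size:
            out_in[i] = dev_in(arr[i])
            out_eq[i] = dev_eq(arr[i])

    arr = np.asarray(values, dtype=np.int64)
    out_in = np.zeros(arr.size, dtype=np.bool_)
    out_eq = np.zeros(arr.size, dtype=np.bool_)
    threads = 64
    blocks = max(1, (arr.size + threads - 1) // threads)
    kernel[blocks, threads](arr, out_in, out_eq)
    return out_in.tolist(), out_eq.tolist()


if __name__ == "__main__":
    result = compare_on_gpu([0, 1, 2, 3])
    if result is None:
        print("CUDA target not available; comparing in plain Python only")
        print([member_in(x) == member_eq(x) for x in (0, 1, 2, 3)])
    else:
        print(result[0] == result[1])
```

Without Numba or a GPU, `compare_on_gpu` returns `None`, and the two plain Python functions can still be compared directly.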