-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make vectors trivially relocatable. #337
Open
sukeya
wants to merge
14
commits into
AcademySoftwareFoundation:main
Choose a base branch
from
sukeya:make_vecs_trivially_relocatable
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+358
−5
Open
Changes from 2 commits
Commits
Show all changes
14 commits
Select commit
Hold shift + click to select a range
69b4559
Make vectors trivially relocatable.
sukeya 56f83c3
Keep both ways of having default ctor or not.
sukeya abc7c5e
Fix using default ctor whatever value the flag is.
sukeya 04950f4
Revert "Fix using default ctor whatever value the flag is."
sukeya 9db4019
Fix using a build flag to use a macro.
sukeya c49dc99
[wip] add a test for ImathVec when using default.
sukeya 5d4ed11
Remove Imath::Config from ImathTestVecUseDefaultConstructor.
sukeya 63478fc
Revert "Remove Imath::Config from ImathTestVecUseDefaultConstructor."
sukeya 536b430
Remove Imath::Config from ImathTestVecUseDefaultConstructor.
sukeya 961ac87
Fix assertion errors and clean up.
sukeya 03f055f
Update src/Imath/ImathVec.h
sukeya 64674d1
Updata src/Imath/ImathVec.h
sukeya 7ad5437
Update src/Imath/ImathVec.h
sukeya 386faf1
Merge remote-tracking branch 'origin/main' into make_vecs_trivially_r…
sukeya File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead we want
Because it's easy to imagine that a particular downstream app needs to make sure it's the other way, and this would allow it to be overridden if the downstream modules defines the symbol as 0 before including any imath headers.
Then the various places where you have
should be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fly in the ointment, if we go both ways, are we inviting an recurring support burden along these lines?
I can't make my build work with MegaDCC",
oh right, did you match IMATH_VEC_USE_DEFAULT_CONSTRUCTOR to what they picked?
but that's in conflict to our internal build
yeah but did you namespace?
we don't want to have to have two copies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OPENEXR_DLL thing was sort of a harbinger, I'm worried about a matrix of macros people have to solve for compatibility
or maybe I'm misunderstanding completely, and it's all inlined and completely doesn't matter which you pick
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think because it's inlined it's all ok? There shouldn't be any link issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, great. Then my only request is that we document why we have it both ways, even if it is a copy and paste of @AlexMWells thorough notes above. This is strongly in the realm of "gosh, this is rather esoteric, why is this a choice, and how do I make it?" I'd prefer the doc is inline in the CMakeLists.txt where you are faced with making a choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example, well with an older compiler, but just wanted to show the behavior.
https://godbolt.org/z/rEY4Ef5GM
With
#define VEC_ENABLE_DEFAULT_CONSTRUCTOR 1
The optimization report shows:
Structure assignment was serialized really kills performance, and we can see the compiler by "default" decided to generate a memcpy.
By providing the copy constructor and assignment operator explicitly we avoid the memcpy or memset (for initializations) and with #define VEC_ENABLE_DEFAULT_CONSTRUCTOR 0
The optimization report shows:
Now this is all very dependent on compiler and order of optimizations, and perhaps isn't as required on newer compilers. But this is just a simple example, real world kernels could have much larger state being kept on the stack and any nested aggregate that happens to use a memcpy or memset by default could prevent the SROA pass from fully succeeding.
Thus it would be nice for projects like OSL to choose to disable the use of default constructors. And if the supported compiler matrix has moved on to where it is not a problem then it could chose to enable them. Problem is examining optimization reports before and after is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that sounds like explicitly defining copy and assignment, and not using default, we get the behavior that we want, which is that things optimize as we wish. "we" includes OSL. I can't think of a reason to provide this as one option, and default constructors and copy another option.
I am rereading this argument which suggests maybe we need the option,
#337 (comment)
but I'm not understanding a case to preserve the default option. Sorry if I'm being dense....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I made this PR was because I wanted to copy
Vec3
sequences between host and device usingthrust::async::copy
. It requiresVec3
to be trivially relocatable, so default constructors and assignment operators for copy and move are needed for me.I'm sorry for not telling it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies, I didn't mean to shut down your effort. Maybe I am misunderstanding? I thought that adding the explicit copy and move gives us the trivial relocatability you are searching for, and also satisfies the SROA concern that @AlexMWells raises? I thought we can solve both your need, and also the SIMD concern at the same time, without needing to have a special conditional. My intent wasn't to stop your effort, I was hoping we can find one solution that solves both problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first, I apologize for my quick but rough decision.
I misunderstood I couldn't solve the assertion error, but I noticed that it was caused by ODR violation. Though I'm running CI in my forked repository, it will be OK.
I think that the trivial relocatability and the SROA are incompatible because the trivial relocatability requires
Vec3
not to have user-provided copy constructor and assignment operator(even thoughVec3::Vec3(const Vec3& v) : x(v.x), y(v.y), z(v.z) {}
), but on the other hand, providing the copy constructor and assignment operator increases the performance as @AlexMWells showed earlier.So, we have to switch
default
or not in some way.Thank you :)