Refactor MPI serialization #689
Conversation
@kks32 It would be nice to see a performance comparison of using serialize and deserialize vs the normal HDF5 approach, just to make sure there is no performance regression. Also, can we check it for different numbers of MPI ranks?
We won't have a big difference in the amount of information being sent/received. Furthermore, it would be hard to measure any significant speed difference in MPI transfer unless we run on hundreds of nodes with millions of particles, and even then I don't think it will be a big difference, since the data size change is very small. However, as previously mentioned in the RFC, the time to serialize/deserialize particles as PODs or a vector of …
Codecov Report
```diff
@@            Coverage Diff             @@
##           develop     #689      +/-   ##
===========================================
- Coverage    96.81%   96.69%    -0.11%
===========================================
  Files          131      130        -1
  Lines        25811    25822       +11
===========================================
- Hits         24987    24968       -19
- Misses         824      854       +30
```
Continue to review full report at Codecov.
@bodhinandach, @tianchiTJ, or @jgiven100: would you be able to test the MPI scheme with a material model that has state variables (NorSand or MC)? Check with load balancing or any problem that involves migration of particles.
I tested it with the MC model, and the results look good.
@kks32 NorSand test looks good
Thanks @jgiven100 and @tianchiTJ for testing with state-variable materials.
@kks32, I would like to understand the data being presented.
What is …
POD alone is insufficient as you need to register data with …
Thanks for the awesome refactoring idea @kks32. Some comments from my side; I am working on the two-phase and fluid particles.
Co-authored-by: Bodhinanda Chandra <[email protected]>
MPM Particle serialization
Summary
Motivation
Design Detail
The `Particle` class will have a `serialize` and a `deserialize` function, both using a `vector<uint8_t>` as the buffer. In addition, we need to compute the pack size to initialize the buffer with the correct size; this is saved as a private variable. The deserialization function will read from the buffer.
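A minimal sketch of what this pair could look like using MPI Pack/Unpack (the mechanism named in the Rationale section below); the member names (`type_`, `coordinates_`, `compute_pack_size`) are illustrative placeholders, not the actual `Particle` interface:

```cpp
// Sketch only: a Particle packing a type id and coordinates into a byte
// buffer with MPI_Pack/MPI_Unpack. MPI must be initialized before use.
#include <mpi.h>

#include <cstdint>
#include <vector>

class Particle {
 public:
  //! Serialize particle data into a contiguous byte buffer
  std::vector<uint8_t> serialize() {
    if (pack_size_ == 0) pack_size_ = compute_pack_size();
    std::vector<uint8_t> buffer(pack_size_);
    int position = 0;
    // The particle type id is packed first, so the receiver can construct
    // the right particle type before unpacking the rest
    MPI_Pack(&type_, 1, MPI_INT, buffer.data(), pack_size_, &position,
             MPI_COMM_WORLD);
    MPI_Pack(coordinates_, 3, MPI_DOUBLE, buffer.data(), pack_size_,
             &position, MPI_COMM_WORLD);
    return buffer;
  }

  //! Deserialize particle data from the byte buffer
  void deserialize(const std::vector<uint8_t>& buffer) {
    int position = 0;
    const int size = static_cast<int>(buffer.size());
    MPI_Unpack(buffer.data(), size, &position, &type_, 1, MPI_INT,
               MPI_COMM_WORLD);
    MPI_Unpack(buffer.data(), size, &position, coordinates_, 3, MPI_DOUBLE,
               MPI_COMM_WORLD);
  }

 private:
  //! Compute the total pack size once so the buffer is allocated correctly
  int compute_pack_size() const {
    int total = 0, partial = 0;
    MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &partial);  // type id
    total += partial;
    MPI_Pack_size(3, MPI_DOUBLE, MPI_COMM_WORLD, &partial);  // coordinates
    total += partial;
    return total;
  }
  int type_{0};
  int pack_size_{0};
  double coordinates_[3]{};
};
```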
Important consideration: We expect all future derived particle types to have the first few bytes be the `Type` of particle, followed by the `material` information, which the mesh class retrieves to initialize the particle before the rest of the deserialization.

In addition, the particle type is added to the `Particle` class. This is used to identify the type of particle and to create particles when they are transferred across MPI tasks. Moreover, we have added `ParticleType` and `ParticleTypeString` as global maps to tie an index value (int) to a string such as "P2D". The reason for this is that if we serialized the string itself, we would have no idea of the length of the string, which makes deserialization complicated. Instead, since we are only going to have a few different particle types, it's easier to set up a map to do a quick lookup.
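A sketch of the two global maps, assuming `std::map`; only "P2D" is named in this document, so the remaining entries are illustrative:

```cpp
#include <map>
#include <string>

// Particle type string -> integer index stored in the serialized buffer
static std::map<std::string, int> ParticleType = {{"P2D", 0}, {"P3D", 1}};
// Reverse lookup to recreate the correct particle type on the receiving rank
static std::map<int, std::string> ParticleTypeString = {{0, "P2D"},
                                                        {1, "P3D"}};
```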
The MPI `transfer_halo_particles` function will be altered to send one particle at a time rather than a bulk of particles. This is to allow sending the different particle types in a cell all at once (sequentially), instead of iterating through each particle type. These changes would remove the need for registering MPI particle types and also get us one more step closer to removing the limit of 20 on the `state_vars`.
.Drawbacks
No potential drawback has been identified.
Rationale and Alternatives
The relative speed of serialization vs `MPI_Type_create_struct` is unknown; we may have to run a performance benchmark to see the difference. Using struct data types means we have to register each particle type, and each registered type has a fixed number of state variables.
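For contrast, a sketch of the struct-registration alternative; the `ParticlePOD` layout is hypothetical, but it shows how the number of state variables gets baked into the registered type:

```cpp
#include <mpi.h>

#include <cstddef>

// Hypothetical POD layout; every registered particle type needs one of these
struct ParticlePOD {
  double coordinates[3];
  double velocity[3];
  double state_vars[20];  // fixed limit baked into the registered MPI type
};

//! Register the POD layout as an MPI datatype (one registration per type)
MPI_Datatype register_particle_type() {
  int block_lengths[3] = {3, 3, 20};
  MPI_Datatype types[3] = {MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE};
  MPI_Aint offsets[3] = {offsetof(ParticlePOD, coordinates),
                         offsetof(ParticlePOD, velocity),
                         offsetof(ParticlePOD, state_vars)};
  MPI_Datatype particle_type;
  MPI_Type_create_struct(3, block_lengths, offsets, types, &particle_type);
  MPI_Type_commit(&particle_type);
  return particle_type;
}
```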
Different serialization libraries were considered: [Boost Serialization](https://www.boost.org/doc/libs/1_56_0/libs/serialization/doc/tutorial.html), [Cereal](http://uscilab.github.io/cereal/), and [bitsery](https://github.com/fraillt/bitsery). The fastest of these, bitsery, doesn't have serialization support for Eigen; we could implement a custom serializer, but it would take some time. MPI Pack/Unpack seems to be one of the fastest options.
(Benchmark results comparing serialized size and serialization time were presented here.)
If not done, this will result in a clunkier interface for handling the different MPI transfers.
Prior Art
Prior art, both good and bad, considered in relation to this proposal:
https://github.com/STEllAR-GROUP/cpp-serializers
https://github.com/fraillt/bitsery#why-use-bitsery
Unresolved questions
The MPI transfer halo particles function is yet to be implemented. I don't foresee an issue, but it is still TBD.
#680
#681
Changelog