
Refactor MPI serialization #689

Merged
merged 20 commits into develop from refactor/serialization on Aug 26, 2020

Conversation

kks32
Contributor

@kks32 kks32 commented Aug 14, 2020

MPM Particle serialization

Summary

Add functionality to handle particle serialization and deserialization to transfer particles across MPI tasks.

Motivation

The existing design uses a Plain-Old-Data (POD) structure: the Particle class writes all the relevant information to an HDF5Particle struct, which is then serialized and sent over MPI using MPI_Type_Create_Struct. This requires registering every particle type with MPI and becomes harder to maintain when more than one particle type is involved. The serialization/deserialization functions offer a unified interface to transfer particles.

Design Detail

The Particle class will have a serialize and a deserialize function, both using a vector<uint8_t> as the buffer. In addition, we need to compute the pack size to initialize the buffer with the correct size; this size is cached in a private variable.

//! Serialize particle data
template <unsigned Tdim>
std::vector<uint8_t> mpm::Particle<Tdim>::serialize() {
  // Compute pack size (cached after the first call)
  if (pack_size_ == 0) pack_size_ = compute_pack_size();
  // Initialize data buffer
  std::vector<uint8_t> data;
  data.resize(pack_size_);
  uint8_t* data_ptr = &data[0];
  int position = 0;

#ifdef USE_MPI
  // Type
  int type = ParticleType.at(this->type());
  MPI_Pack(&type, 1, MPI_INT, data_ptr, data.size(), &position, MPI_COMM_WORLD);

  // Material id
  unsigned nmaterials = material_id_.size();
  MPI_Pack(&nmaterials, 1, MPI_UNSIGNED, data_ptr, data.size(), &position,
           MPI_COMM_WORLD);
  MPI_Pack(&material_id_[0], 1, MPI_UNSIGNED, data_ptr, data.size(), &position,
           MPI_COMM_WORLD);

  // ID
  MPI_Pack(&id_, 1, MPI_UNSIGNED_LONG_LONG, data_ptr, data.size(), &position,
           MPI_COMM_WORLD);
  // Mass
  MPI_Pack(&mass_, 1, MPI_DOUBLE, data_ptr, data.size(), &position,
           MPI_COMM_WORLD);
  // Volume
  MPI_Pack(&volume_, 1, MPI_DOUBLE, data_ptr, data.size(), &position,
           MPI_COMM_WORLD);
#endif
  return data;
}
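
The buffer size itself can be computed with MPI_Pack_size, following the same order as the packing above. A minimal sketch of such a compute_pack_size is shown below (assuming it only covers the members shown in serialize; the actual implementation packs more members):

//! Compute pack size of the serialized particle
//! (sketch mirroring the pack order of serialize above)
template <unsigned Tdim>
int mpm::Particle<Tdim>::compute_pack_size() const {
  int total_size = 0;
#ifdef USE_MPI
  int partial_size = 0;
  // Particle type
  MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &partial_size);
  total_size += partial_size;
  // Number of materials + material id
  MPI_Pack_size(1, MPI_UNSIGNED, MPI_COMM_WORLD, &partial_size);
  total_size += 2 * partial_size;
  // Particle id
  MPI_Pack_size(1, MPI_UNSIGNED_LONG_LONG, MPI_COMM_WORLD, &partial_size);
  total_size += partial_size;
  // Mass and volume
  MPI_Pack_size(1, MPI_DOUBLE, MPI_COMM_WORLD, &partial_size);
  total_size += 2 * partial_size;
#endif
  return total_size;
}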

The deserialization function will read from the buffer.

//! Deserialize particle data
template <unsigned Tdim>
void mpm::Particle<Tdim>::deserialize(
    const std::vector<uint8_t>& data,
    std::vector<std::shared_ptr<mpm::Material<Tdim>>>& materials) {
  uint8_t* data_ptr = const_cast<uint8_t*>(&data[0]);
  int position = 0;

#ifdef USE_MPI
  // Type
  int type = ParticleType.at(this->type());
  MPI_Unpack(data_ptr, data.size(), &position, &type, 1, MPI_INT,
             MPI_COMM_WORLD);
  // Material id
  unsigned nmaterials = 0;
  MPI_Unpack(data_ptr, data.size(), &position, &nmaterials, 1, MPI_UNSIGNED,
             MPI_COMM_WORLD);

  MPI_Unpack(data_ptr, data.size(), &position, &material_id_[0], 1,
             MPI_UNSIGNED, MPI_COMM_WORLD);

  // ID
  MPI_Unpack(data_ptr, data.size(), &position, &id_, 1, MPI_UNSIGNED_LONG_LONG,
             MPI_COMM_WORLD);
  // Mass
  MPI_Unpack(data_ptr, data.size(), &position, &mass_, 1, MPI_DOUBLE,
             MPI_COMM_WORLD);
  // Volume
  MPI_Unpack(data_ptr, data.size(), &position, &volume_, 1, MPI_DOUBLE,
             MPI_COMM_WORLD);

#endif
}

Important consideration: we expect all future derived particle types to place the particle type in the first few bytes of the buffer, followed by the material information, so that the mesh class can read them to initialize the particle before the subsequent deserialization.

In addition, the particle type is added to the Particle class.

  //! Type of particle
  std::string type() const override { return (Tdim == 2) ? "P2D" : "P3D"; }

This is used to identify the type of a particle and create it when particles are transferred across MPI tasks. Moreover, we have added ParticleType and ParticleTypeName as global maps between an index value (int) and a string such as "P2D". The reason is that serializing a string would also require transmitting its length, which complicates the buffer layout. Since we will only ever have a few particle types, it is easier to set up a map for a quick lookup.

particle.cc
namespace mpm {
// ParticleType
std::map<std::string, int> ParticleType = {{"P2D", 0}, {"P3D", 1}};
std::map<int, std::string> ParticleTypeName = {{0, "P2D"}, {1, "P3D"}};
}  // namespace mpm

The MPI transfer_halo_particles function will be altered to send one particle at a time rather than a bulk of particles. This allows different particle types in a cell to be sent sequentially through a single code path, instead of iterating over each particle type separately. A sketch of the receive path follows below.
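
On the receiving rank, the leading type integer can be peeked to create a particle of the correct derived type before handing it the full buffer. A rough sketch under assumed context (buffer_size, sender, tag, id, coordinates and materials come from the surrounding transfer code, and create_particle is a stand-in for whatever factory call is used; only the MPI calls and the maps above are from this proposal):

#ifdef USE_MPI
  // Receive the serialized particle buffer
  std::vector<uint8_t> buffer(buffer_size);
  MPI_Status status;
  MPI_Recv(buffer.data(), buffer.size(), MPI_UNSIGNED_CHAR, sender, tag,
           MPI_COMM_WORLD, &status);

  // Peek at the leading particle type (the first packed int)
  int ptype;
  int position = 0;
  MPI_Unpack(buffer.data(), buffer.size(), &position, &ptype, 1, MPI_INT,
             MPI_COMM_WORLD);
  const std::string particle_type = mpm::ParticleTypeName.at(ptype);

  // Create a particle of the matching derived type (hypothetical factory
  // helper), then let it deserialize the full buffer, type included
  auto particle = create_particle(particle_type, id, coordinates);
  particle->deserialize(buffer, materials);
#endif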

These changes remove the need to register MPI particle types and also take us one step closer to removing the limit of 20 on the state_vars.

Drawbacks

No potential drawback has been identified.

Rationale and Alternatives

Why is this design the best in the space of possible designs?

The relative speed of serialization vs MPI_Type_Create_Struct is unknown; we may have to run a performance benchmark to see the difference. Using struct data types means we have to register each particle type and are restricted to a fixed number of state variables.
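
For reference, registering a POD with MPI requires calls of roughly the following shape for every particle type (an abbreviated sketch with only three fields; the HDF5Particle member names are assumptions, and mpm::register_mpi_particle_type covers the full member list):

#include <cstddef>  // offsetof
#include <mpi.h>

// Abbreviated sketch of POD registration; not the project's actual helper
MPI_Datatype register_pod_type_sketch() {
  const int nfields = 3;
  int blocklengths[nfields] = {1, 1, 1};
  MPI_Aint displacements[nfields];
  displacements[0] = offsetof(mpm::HDF5Particle, id);      // assumed member
  displacements[1] = offsetof(mpm::HDF5Particle, mass);    // assumed member
  displacements[2] = offsetof(mpm::HDF5Particle, volume);  // assumed member
  MPI_Datatype types[nfields] = {MPI_UNSIGNED_LONG_LONG, MPI_DOUBLE,
                                 MPI_DOUBLE};
  MPI_Datatype particle_type;
  MPI_Type_create_struct(nfields, blocklengths, displacements, types,
                         &particle_type);
  MPI_Type_commit(&particle_type);
  return particle_type;  // must later be released with MPI_Type_free
}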

What other designs have been considered and what is the rationale for not choosing them?

Different serialization libraries were considered: [Boost Serialization](https://www.boost.org/doc/libs/1_56_0/libs/serialization/doc/tutorial.html), [Cereal](http://uscilab.github.io/cereal/), and [bitsery](https://github.com/fraillt/bitsery). The fastest, bitsery, does not support serializing Eigen types; we could implement a custom serializer, but it would take some time. MPI Pack/Unpack seems to be one of the fastest options.

[Benchmark charts: serialized size and serialization time for the libraries considered]

What is the impact of not doing this?

If not done, we will be left with a clunkier interface for handling MPI transfers of different particle types.

Prior Art

Prior art, both good and bad, in relation to this proposal:

https://github.com/STEllAR-GROUP/cpp-serializers

https://github.com/fraillt/bitsery#why-use-bitsery

Unresolved questions

What parts of the design do you expect to resolve through the RFC process before this gets merged?

The MPI transfer_halo_particles function is yet to be implemented. We don't foresee an issue, but it is still TBD.

Related issues

#680
#681

Changelog

@bodhinandach
Contributor

@kks32 It would be nice to see a performance comparison of serialize/deserialize vs the normal HDF5 POD approach, just to make sure there is no performance regression. Also, can we check it for different numbers of MPI ranks?

@kks32
Contributor Author

kks32 commented Aug 17, 2020

We won't have a big difference in the amount of information being sent/received. Furthermore, it would be hard to measure any significant speed difference in the MPI transfer unless we run hundreds of nodes with millions of particles, and even then I don't think it would be a big difference, since the change in data size is very small. However, as previously mentioned in the RFC, the time to serialize/deserialize particles as PODs vs as a vector of uint8_t has been benchmarked, and the results show that serialization with uint8_t is faster than POD + MPI_Type_Create_Struct. Serialization/deserialization of a POD in itself is faster; however, registering and deregistering the MPI data types takes more time than serializing/deserializing into a vector of unsigned bytes.

[Image: serialization benchmark results]

SECTION("Performance benchmarks") {
  // Number of iterations
  unsigned niterations = 1000;

  // Serialization benchmarks
  auto serialize_start = std::chrono::steady_clock::now();
  for (unsigned i = 0; i < niterations; ++i) {
    // Serialize particle
    auto buffer = particle->serialize();
    // Deserialize particle
    std::shared_ptr<mpm::ParticleBase<Dim>> rparticle =
        std::make_shared<mpm::Particle<Dim>>(id, pcoords);

    REQUIRE_NOTHROW(rparticle->deserialize(buffer, materials));
  }
  auto serialize_end = std::chrono::steady_clock::now();

  // HDF5 serialization
  auto hdf5_start = std::chrono::steady_clock::now();
  for (unsigned i = 0; i < niterations; ++i) {
    // Serialize particle as POD
    auto hdf5 = particle->hdf5();
    // Deserialize particle with POD
    std::shared_ptr<mpm::ParticleBase<Dim>> rparticle =
        std::make_shared<mpm::Particle<Dim>>(id, pcoords);
    // Initialize MPI datatypes
    MPI_Datatype particle_type = mpm::register_mpi_particle_type(hdf5);
    REQUIRE_NOTHROW(rparticle->initialise_particle(hdf5, material));
    mpm::deregister_mpi_particle_type(particle_type);
  }
  auto hdf5_end = std::chrono::steady_clock::now();
}
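
The elapsed times of the two schemes can then be reported from the time points above, for example (a sketch placed inside the SECTION, not part of the test as merged; requires <chrono> and <iostream>):

  // Report elapsed wall-clock time of both schemes in milliseconds
  std::chrono::duration<double, std::milli> serialize_time =
      serialize_end - serialize_start;
  std::chrono::duration<double, std::milli> hdf5_time = hdf5_end - hdf5_start;
  std::cout << "Pack/Unpack: " << serialize_time.count() << " ms\n"
            << "POD/Struct:  " << hdf5_time.count() << " ms\n";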

@codecov

codecov bot commented Aug 17, 2020

Codecov Report

Merging #689 into develop will decrease coverage by 0.11%.
The diff coverage is 67.70%.


@@             Coverage Diff             @@
##           develop     #689      +/-   ##
===========================================
- Coverage    96.81%   96.69%   -0.11%     
===========================================
  Files          131      130       -1     
  Lines        25811    25822      +11     
===========================================
- Hits         24987    24968      -19     
- Misses         824      854      +30     
| Impacted Files | Coverage Δ |
|---|---|
| include/mesh.h | 100.00% <ø> (ø) |
| include/mesh.tcc | 82.65% <0.00%> (-1.48%) ⬇️ |
| include/particles/particle_base.h | 100.00% <ø> (ø) |
| tests/graph_test.cc | 100.00% <ø> (ø) |
| include/particles/particle.tcc | 91.92% <82.69%> (-2.01%) ⬇️ |
| include/particles/particle.h | 100.00% <100.00%> (ø) |
| include/solvers/mpm_explicit.tcc | 95.16% <100.00%> (+0.08%) ⬆️ |
| tests/particle_serialize_deserialize_test.cc | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@kks32 kks32 marked this pull request as ready for review August 17, 2020 23:31
@kks32
Contributor Author

kks32 commented Aug 17, 2020

@bodhinandach or @tianchiTJ or @jgiven100 would you be able to test the MPI scheme with a material model that has state variables (NorSand or MC)? Check with load balancing or any problem that involves migration of particles.

@tianchiTJ
Contributor

tianchiTJ commented Aug 18, 2020

I tested it with the MC model, and I think the result is good.

@jgiven100
Collaborator

@kks32 NorSand test looks good

@kks32
Contributor Author

kks32 commented Aug 18, 2020

Thanks @jgiven100 and @tianchiTJ for testing with materials that have state variables.

@ezrayst
Contributor

ezrayst commented Aug 20, 2020

@kks32, I would like to understand the data being presented.

The pack/unpack serialization in this PR is faster than the POD struct implementation for the 2D sliding block with 4 MPI ranks. The results are an average of 5 different runs.

| Scheme | Avg Time (ms) | SD (ms) |
|---|---|---|
| Pack/Unpack | 13201 | 326 |
| POD/Struct | 13815 | 540 |

What is SD here? Previously you showed POD has a 0.4 to 0.7 speedup compared to Pack/Unpack, but why does this result show that POD takes longer? (I think I am missing something here, sorry.)

@kks32
Contributor Author

kks32 commented Aug 20, 2020

POD alone is insufficient, as you need to register the data type with MPI_Type_Create_Struct, which adds additional run-time. Compared to our current implementation on develop, Pack/Unpack is slightly faster, and it is the best way to handle different particle types.

Contributor

@bodhinandach bodhinandach left a comment

Thanks for the awesome refactoring idea @kks32. Some comments from my side; I am working on the two-phase and fluid particles.

Contributor

@bodhinandach bodhinandach left a comment

This looks great to me @kks32. A similar implementation for the two-phase particle has been implemented in PR #680 and tested to work well with dynamic load balancing. Thanks for addressing my comments too.

@kks32 kks32 merged commit 19570dd into develop Aug 26, 2020
@kks32 kks32 deleted the refactor/serialization branch August 26, 2020 12:14