This is my master's thesis project on the feasibility of transfer learning for data-driven motion generation frameworks. All code necessary for reproducing the results described in the thesis is provided here as-is. The work covers two main topics:

- Objective-driven motion generation (OMG) model architecture
- Rig-agnostic encoding approaches
The motion data created in Unity are exported as JSON files, which are parsed into NumPy arrays and stored as bzip2-compressed binary files.
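For reference, the export side of this pipeline can look roughly like the sketch below. The JSON field names (`frames`, `jointPositions`) are hypothetical placeholders, not the actual Unity export schema.

```python
import bz2
import json
import pickle
import numpy as np

def json_to_compressed_numpy(json_path: str, out_path: str) -> None:
    """Parse an exported motion clip and store it as a bzip2-compressed array."""
    with open(json_path, "r") as f:
        clip = json.load(f)

    # One row per frame, one column per joint feature (field names are illustrative).
    frames = np.asarray(
        [frame["jointPositions"] for frame in clip["frames"]], dtype=np.float32
    )

    # Store the array as a bzip2-compressed pickle.
    with bz2.open(out_path, "wb") as f:
        pickle.dump(frames, f)

def load_compressed_numpy(path: str) -> np.ndarray:
    with bz2.open(path, "rb") as f:
        return pickle.load(f)
```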
The models are implemented in Python using PyTorch and PyTorch Lightning.
The model implementations are based on [MANN][1], [NSM][2], [LMPMoE][3], [MVAE][4] and [TRLSTM][5]. The implemented models were tested on a small subset of the dataset to verify the implementation, i.e. to confirm that the reconstruction errors decrease during training and that the models can generate correct animations. Hyperparameters such as the number of layers, the layer sizes and the learning rates were tuned using Ray Tune with the ASHA scheduler and a grid search algorithm.
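A minimal sketch of such a tuning setup, assuming the classic `tune.run` API of Ray Tune 1.x; the search space values and the toy trainable are illustrative only, not the thesis training code.

```python
import torch
from ray import tune
from ray.tune.schedulers import ASHAScheduler

def train_model(config):
    # Toy regression task standing in for the real training loop.
    x = torch.randn(256, 16)
    y = torch.randn(256, 4)

    # Build an MLP from the sampled hyperparameters.
    layers, width = [], 16
    for _ in range(config["num_layers"]):
        layers += [torch.nn.Linear(width, config["layer_size"]), torch.nn.ELU()]
        width = config["layer_size"]
    layers.append(torch.nn.Linear(width, 4))
    model = torch.nn.Sequential(*layers)

    opt = torch.optim.Adam(model.parameters(), lr=config["lr"])
    for _ in range(20):
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
        tune.report(val_loss=loss.item())  # ASHA uses this to prune lagging trials early

analysis = tune.run(
    train_model,
    config={
        "num_layers": tune.grid_search([2, 3, 4]),
        "layer_size": tune.grid_search([128, 256, 512]),
        "lr": tune.grid_search([1e-3, 1e-4]),
    },
    scheduler=ASHAScheduler(metric="val_loss", mode="min"),
)
print(analysis.get_best_config(metric="val_loss", mode="min"))
```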
The white character plays the target animation. The blue character shows the animation generated by the vanilla OMG model with limited training and data. The red character comes from a warm-started OMG model whose parameters are taken from a model pre-trained on another rig; in this case only the autoencoders are optimised, i.e. only the input and output models are trained. The green character is the same as the red one, except that the core generation model is also trained.
In this case, the pose inputs to the OMG models contain data for only 6 key joints (hands, feet, head and pelvis). The OMG model is therefore responsible not only for predicting the next pose, but also for upscaling it to the full-resolution pose.
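A minimal sketch of this warm-start setup, with illustrative module names and dimensions (6 key joints in, a full-resolution pose out, and the pre-trained core frozen while only the autoencoders are optimised); it is not the thesis code.

```python
import torch

class OMG(torch.nn.Module):
    """Illustrative OMG-style model: encoder -> core -> decoder."""

    def __init__(self, key_joint_dim=6 * 12, latent_dim=64, full_pose_dim=31 * 12):
        super().__init__()
        # Rig-specific input model: encodes the sparse key-joint pose.
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(key_joint_dim, 256), torch.nn.ELU(),
            torch.nn.Linear(256, latent_dim))
        # Rig-agnostic core generation model (warm-started from another rig).
        self.core = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, 256), torch.nn.ELU(),
            torch.nn.Linear(256, latent_dim))
        # Rig-specific output model: upscales to the full-resolution pose.
        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, 256), torch.nn.ELU(),
            torch.nn.Linear(256, full_pose_dim))

    def forward(self, key_joints):
        return self.decoder(self.core(self.encoder(key_joints)))

model = OMG()

# Warm start: copy the core weights from a model trained on a different rig
# (here a fresh instance stands in for that pre-trained model).
pretrained = OMG()
model.core.load_state_dict(pretrained.core.state_dict())

# "Red" variant: freeze the core and optimise only the autoencoders.
for p in model.core.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4)
```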
- Jupyter Notebooks - contains the notebooks for computing and plotting the results (assuming the models are trained and available).
- MLP with adversarial net - the default autoencoder (a 3-layer MLP) combined with an adversarial Conv-LSGAN model that provides the adversarial error for the generated poses (see the adversarial-loss sketch after this list)
- Clustering models - contains four AE variants with an extra layer between the encoder and decoder that performs clustering on the embeddings (see the clustering sketch after this list)
- Experiments - contains code for training, validating and testing the various models
- func - contains miscellaneous functions for extracting and preparing data
- motion_generation_models - contains the various OMG models and MoGenNet
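As referenced above, here is a minimal sketch of an LSGAN-style adversarial error term for generated poses; the convolutional discriminator architecture, pose dimensions and window length are illustrative assumptions, not the thesis code.

```python
import torch

class PoseDiscriminator(torch.nn.Module):
    """Scores a short window of consecutive poses with 1D convolutions."""

    def __init__(self, pose_dim=372, window=8):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv1d(pose_dim, 128, kernel_size=3, padding=1), torch.nn.ELU(),
            torch.nn.Conv1d(128, 64, kernel_size=3, padding=1), torch.nn.ELU(),
            torch.nn.Flatten(),
            torch.nn.Linear(64 * window, 1))

    def forward(self, poses):  # poses: (batch, pose_dim, window)
        return self.net(poses)

def lsgan_losses(disc, real_poses, fake_poses):
    mse = torch.nn.functional.mse_loss
    real_score = disc(real_poses)
    fake_score_d = disc(fake_poses.detach())
    fake_score_g = disc(fake_poses)
    # Discriminator: push real poses towards 1 and generated poses towards 0.
    d_loss = 0.5 * (mse(real_score, torch.ones_like(real_score))
                    + mse(fake_score_d, torch.zeros_like(fake_score_d)))
    # Adversarial error for the generated poses: push them towards "real" (1).
    g_loss = 0.5 * mse(fake_score_g, torch.ones_like(fake_score_g))
    return d_loss, g_loss
```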
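And a minimal sketch of the clustering-AE idea, i.e. an autoencoder with an extra soft-assignment layer between encoder and decoder that clusters the embeddings; the dimensions and the specific clustering mechanism are illustrative assumptions, not any of the four variants in the repository.

```python
import torch

class ClusteringAE(torch.nn.Module):
    """Autoencoder with learnable cluster centroids acting on the embedding."""

    def __init__(self, in_dim=372, latent_dim=32, n_clusters=8):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(in_dim, 256), torch.nn.ELU(),
            torch.nn.Linear(256, latent_dim))
        # Extra layer between encoder and decoder: learnable cluster centroids.
        self.centroids = torch.nn.Parameter(torch.randn(n_clusters, latent_dim))
        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, 256), torch.nn.ELU(),
            torch.nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        # Soft cluster assignments from (negative) distances to the centroids.
        distances = torch.cdist(z, self.centroids)
        assignments = torch.softmax(-distances, dim=-1)
        return self.decoder(z), assignments
```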
[1]: Zhang, He, Starke, Sebastian, Komura, Taku, and Saito, Jun. “Mode-adaptive neural networks for quadruped motion control”. In: ACM Transactions on Graphics (TOG) 37.4 (2018), pp. 1–11. ISSN: 0730-0301. DOI: 10.1145/3197517.3201366.
[2]: Starke, Sebastian, Zhang, He, Komura, Taku, and Saito, Jun. “Neural state machine for character-scene interactions”. In: ACM Transactions on Graphics (TOG) 38.6 (2019), pp. 1–14. ISSN: 0730-0301. DOI: 10.1145/3355089.3356505.
[3]: Starke, Sebastian, Zhao, Yiwei, Komura, Taku, and Zaman, Kazi. “Local motion phases for learning multi-contact character movements”. In: ACM Transactions on Graphics (TOG) 39.4 (2020), 54:1–54:13. ISSN: 0730-0301. DOI: 10.1145/3386569.3392450.
[4]: Ling, Hung Yu, Zinno, Fabio, Cheng, George, and van de Panne, Michiel. “Character controllers using motion VAEs”. In: ACM Transactions on Graphics (TOG) 39.4 (2020), 40:1–40:12. ISSN: 0730-0301. DOI: 10.1145/3386569.3392422.
[5]: Harvey, Félix G., Yurick, Mike, Nowrouzezahrai, Derek, and Pal, Christopher. “Robust motion in-betweening”. In: ACM Transactions on Graphics (TOG) 39.4 (2020), 60:1–60:12. ISSN: 0730-0301. DOI: 10.1145/3386569.3392480.