Multimodal DiT - Pytorch (wip)

Implementation of a multimodal diffusion transformer in PyTorch.

If you are doing research on this topic, or have suggestions on recent findings to incorporate, please drop a line here.