v-iashin / SpecVQGAN (326 stars). Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021). Topics: audio, video, pytorch, transformer, gan, multi-modal, evaluation-metrics, video-understanding, vas, video-features, vqvae, bmvc, melgan, audio-generation, vggsound. Updated Jun 6, 2023. Language: Jupyter Notebook.
v-iashin / SparseSync (45 stars). Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022). Topics: synchronization, pytorch, transformer, lrs, sparse, multi-modal, audio-visual, bmvc, vggsound. Updated Jan 29, 2024. Language: Python.