- From Sora What We Can See: A Survey of Text-to-Video Generation,
arXiv, 2405.10674, arxiv, pdf, citations: n/a
Rui Sun, Yumin Zhang, Tejal Shah, Jiahao Sun, Shuoying Zhang, Wenqi Li, Haoran Duan, Bo Wei, Rajiv Ranjan · (awesome-text-to-video-generation - soraw-ai)
- Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond,
arXiv, 2405.03520, arxiv, pdf, citations: n/a
Zheng Zhu, Xiaofeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang · (General-World-Models-Survey - GigaAI-research)
- Video Diffusion Models: A Survey,
arXiv, 2405.03150, arxiv, pdf, citations: n/a
Andrew Melnik, Michal Ljubljanac, Cong Lu, Qi Yan, Weiming Ren, Helge Ritter
- A Survey on Long Video Generation: Challenges, Methods, and Prospects,
arXiv, 2403.16407, arxiv, pdf, citations: n/a
Chengxuan Li, Di Huang, Zeyu Lu, Yang Xiao, Qingqi Pei, Lei Bai
- Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation,
arXiv, 2403.05131, arxiv, pdf, citations: n/a
Joseph Cho, Fachrina Dewi Puspitasari, Sheng Zheng, Jingyao Zheng, Lik-Hang Lee, Tae-Ho Kim, Choong Seon Hong, Chaoning Zhang
- sorareview - lichao-sun
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
- A Survey on Generative Diffusion Model,
IEEE Transactions on Knowledge and Data Engineering, 2024, arxiv, pdf, citations: 121
Hanqun Cao, Cheng Tan, Zhangyang Gao, Yilun Xu, Guangyong Chen, Pheng-Ann Heng, Stan Z. Li · (A-Survey-on-Generative-Diffusion-Model - chq1155) · (jiqizhixin)
- VideoTetris: Towards Compositional Text-to-Video Generation,
arXiv, 2406.04277, arxiv, pdf, citations: n/a
Ye Tian, Ling Yang, Haotian Yang, Yuan Gao, Yufan Deng, Jingmin Chen, Xintao Wang, Zhaochen Yu, Xin Tao, Pengfei Wan · (VideoTetris - YangLing0818)
- Searching Priors Makes Text-to-Video Synthesis Better,
arXiv, 2406.03215, arxiv, pdf, citations: n/a
Haoran Cheng, Liang Peng, Linxuan Xia, Yuepeng Hu, Hengjia Li, Qinglin Lu, Xiaofei He, Boxi Wu
- SF-V: Single Forward Video Generation Model,
arXiv, 2406.04324, arxiv, pdf, citations: n/a
Zhixing Zhang, Yanyu Li, Yushu Wu, Yanwu Xu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Dimitris Metaxas
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation,
arXiv, 2406.00908, arxiv, pdf, citations: n/a
Shaoshu Yang, Yong Zhang, Xiaodong Cun, Ying Shan, Ran He
- T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback,
arXiv, 2405.18750, arxiv, pdf, citations: n/a
Jiachen Li, Weixi Feng, Tsu-Jui Fu, Xinyi Wang, Sugato Basu, Wenhu Chen, William Yang Wang
- Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control,
arXiv, 2405.17414, arxiv, pdf, citations: n/a
Zhengfei Kuang, Shengqu Cai, Hao He, Yinghao Xu, Hongsheng Li, Leonidas Guibas, Gordon Wetzstein · (collaborativevideodiffusion.github)
- FIFO-Diffusion: Generating Infinite Videos from Text without Training,
arXiv, 2405.11473, arxiv, pdf, citations: n/a
Jihwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han
- Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers,
arXiv, 2405.05945, arxiv, pdf, citations: n/a
Peng Gao, Le Zhuo, Ziyi Lin, Chris Liu, Junsong Chen, Ruoyi Du, Enze Xie, Xu Luo, Longtian Qiu, Yuhang Zhang · (Lumina-T2X - Alpha-VLLM)
- Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models,
arXiv, 2405.04233, arxiv, pdf, citations: n/a
Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun Zhu · (shengshu-ai)
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation,
arXiv, 2405.01434, arxiv, pdf, citations: n/a
Yupeng Zhou, Daquan Zhou, Ming-Ming Cheng, Jiashi Feng, Qibin Hou · (storydiffusion - hvision-nku) · (storydiffusion.github)
- MotionMaster: Training-free Camera Motion Transfer For Video Generation,
arXiv, 2404.15789, arxiv, pdf, citations: n/a
Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma · (sjtuplayer.github) · (MotionMaster - sjtuplayer)
- MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators,
arXiv, 2404.05014, arxiv, pdf, citations: n/a
Shenghai Yuan, Jinfa Huang, Yujun Shi, Yongqi Xu, Ruijie Zhu, Bin Lin, Xinhua Cheng, Li Yuan, Jiebo Luo
- CameraCtrl: Enabling Camera Control for Text-to-Video Generation,
arXiv, 2404.02101, arxiv, pdf, citations: n/a
Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang · (hehao13.github)
- Grid Diffusion Models for Text-to-Video Generation,
arXiv, 2404.00234, arxiv, pdf, citations: n/a
Taegyeong Lee, Soyeong Kwon, Taehwan Kim
- StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text,
arXiv, 2403.14773, arxiv, pdf, citations: n/a
Roberto Henschel, Levon Khachatryan, Daniil Hayrapetyan, Hayk Poghosyan, Vahram Tadevosyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi · (StreamingT2V - Picsart-AI-Research) · (mp.weixin.qq)
- Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition,
arXiv, 2403.14148, arxiv, pdf, citations: n/a
Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar
- Mora: Enabling Generalist Video Generation via A Multi-Agent Framework,
arXiv, 2403.13248, arxiv, pdf, citations: n/a
Zhengqing Yuan, Ruoxi Chen, Zhaoxu Li, Haolong Jia, Lifang He, Chi Wang, Lichao Sun · (Mora - lichao-sun)
- AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production,
arXiv, 2403.07952, arxiv, pdf, citations: n/a
Jiuniu Wang, Zehua Du, Yuyuan Zhao, Bo Yuan, Kexiang Wang, Jian Liang, Yaxi Zhao, Yihen Lu, Gengliang Li, Junlong Gao
- VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models,
arXiv, 2403.05438, arxiv, pdf, citations: n/a
Yabo Zhang, Yuxiang Wei, Xianhui Lin, Zheng Hui, Peiran Ren, Xuansong Xie, Xiangyang Ji, Wangmeng Zuo · (VideoElevator - YBYBZhang)
- Pix2Gif: Motion-Guided Diffusion for GIF Generation,
arXiv, 2403.04634, arxiv, pdf, citations: n/a
Hitesh Kandala, Jianfeng Gao, Jianwei Yang
- Open-Sora - hpcaitech
Build your own video generation model like OpenAI's Sora.
- Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation,
arXiv, 2403.02827, arxiv, pdf, citations: n/a
Weijie Li, Litong Gong, Yiran Zhu, Fanda Fan, Biao Wang, Tiezheng Ge, Bo Zheng · (noise-rectification.github)
- Open-Sora-Plan - PKU-YuanGroup
This project aims to reproduce Sora (OpenAI's T2V model), but we have limited resources. We deeply hope the whole open-source community can contribute to this project.
- Sora Generates Videos with Stunning Geometrical Consistency,
arXiv, 2402.17403, arxiv, pdf, citations: n/a
Xuanyi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng · (sora-geometrical-consistency.github)
- Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models,
arXiv, 2402.17177, arxiv, pdf, citations: n/a
Yixin Liu, Kai Zhang, Yuan Li, Zhiling Yan, Chujie Gao, Ruoxi Chen, Zhengqing Yuan, Yue Huang, Hanchi Sun, Jianfeng Gao
- Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis,
arXiv, 2402.14797, arxiv, pdf, citations: n/a
Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren · (snap-research.github)
- AnimateLCM-SVD-xt - wangfuyun 🤗
- Magic-Me: Identity-Specific Video Customized Diffusion,
arXiv, 2402.09368, arxiv, pdf, citations: n/a
Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng · (Magic-Me - Zhen-Dong)
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation,
arXiv, 2402.04324, arxiv, pdf, citations: n/a
Weiming Ren, Harry Yang, Ge Zhang, Cong Wei, Xinrun Du, Stephen Huang, Wenhu Chen
- Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization,
arXiv, 2402.03161, arxiv, pdf, citations: n/a
Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang · (video-lavit.github)
- Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion,
arXiv, 2402.03162, arxiv, pdf, citations: n/a
Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao · (direct-a-video.github)
- Boximator: Generating Rich and Controllable Motions for Video Synthesis,
arXiv, 2402.01566, arxiv, pdf, citations: n/a
Jiawei Wang, Yuchen Zhang, Jiaxin Zou, Yan Zeng, Guoqiang Wei, Liping Yuan, Hang Li
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions,
arXiv, 2402.03040, arxiv, pdf, citations: n/a
Yiyuan Zhang, Yuhao Kang, Zhixin Zhang, Xiaohan Ding, Sanyuan Zhao, Xiangyu Yue · (InteractiveVideo - invictus717)
- AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning,
arXiv, 2402.00769, arxiv, pdf, citations: n/a
Fu-Yun Wang, Zhaoyang Huang, Xiaoyu Shi, Weikang Bian, Guanglu Song, Yu Liu, Hongsheng Li · (AnimateLCM - G-U-N)
- VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models,
arXiv, 2401.09047, arxiv, pdf, citations: n/a
Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan · (VideoCrafter - AILab-CVC) · (ailab-cvc.github) · (huggingface)
- Lumiere: A Space-Time Diffusion Model for Video Generation,
arXiv, 2401.12945, arxiv, pdf, citations: n/a
Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Yuanzhen Li, Tomer Michaeli
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution,
Proceedings of the IEEE/CVF Winter Conference on Applications …, 2024, arxiv, pdf, citations: n/a
Xin Yuan, Jinoo Baek, Keyang Xu, Omer Tov, Hongliang Fei
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects,
arXiv, 2401.09962, arxiv, pdf, citations: n/a
Zhao Wang, Aoxue Li, Enze Xie, Lingting Zhu, Yong Guo, Qi Dou, Zhenguo Li
- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens,
arXiv, 2401.09985, arxiv, pdf, citations: n/a
Xiaofeng Wang, Zheng Zhu, Guan Huang, Boyuan Wang, Xinze Chen, Jiwen Lu
- UniVG: Towards UNIfied-modal Video Generation,
arXiv, 2401.09084, arxiv, pdf, citations: n/a
Ludan Ruan, Lei Tian, Chuanwei Huang, Xu Zhang, Xinyan Xiao · (univg-baidu.github)
- Vlogger: Make Your Dream A Vlog,
arXiv, 2401.09414, arxiv, pdf, citations: n/a
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang · (Vlogger - zhuangshaobin)
- Towards A Better Metric for Text-to-Video Generation,
arXiv, 2401.07781, arxiv, pdf, citations: n/a
Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao
- Latte: Latent Diffusion Transformer for Video Generation,
arXiv, 2401.03048, arxiv, pdf, citations: n/a
Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Ziwei Liu, Yuan-Fang Li, Cunjian Chen, Yu Qiao · (maxin-cn.github) · (Latte - maxin-cn)
- MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation,
arXiv, 2401.04468, arxiv, pdf, citations: n/a
Weimin Wang, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, Chetwin Low, Tuyen Hoang, Jie Wu, Jun Hao Liew, Hanshu Yan · (magicvideov2.github)
- AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI,
arXiv, 2401.01651, arxiv, pdf, citations: n/a
Fanda Fan, Chunjie Luo, Wanling Gao, Jianfeng Zhan
- Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions,
arXiv, 2401.01827, arxiv, pdf, citations: n/a
David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo · (LAVIS - salesforce)
- VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM,
arXiv, 2401.01256, arxiv, pdf, citations: n/a
Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei
- TrailBlazer: Trajectory Control for Diffusion-Based Video Generation,
arXiv, 2401.00896, arxiv, pdf, citations: n/a
Wan-Duo Kurt Ma, J. P. Lewis, W. Bastiaan Kleijn
- I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models,
arXiv, 2312.16693, arxiv, pdf, citations: n/a
Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, Zhengjun Zha
- A Recipe for Scaling up Text-to-Video Generation with Text-free Videos,
arXiv, 2312.15770, arxiv, pdf, citations: n/a
Xiang Wang, Shiwei Zhang, Hangjie Yuan, Zhiwu Qing, Biao Gong, Yingya Zhang, Yujun Shen, Changxin Gao, Nong Sang · (tf-t2v.github)
- MotionCtrl: A Unified and Flexible Motion Controller for Video Generation,
arXiv, 2312.03641, arxiv, pdf, citations: 1
Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan · (MotionCtrl - TencentARC)
- Generative AI Beyond LLMs: System Implications of Multi-Modal Generation,
arXiv, 2312.14385, arxiv, pdf, citations: n/a
Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks
- InstructVideo: Instructing Video Diffusion Models with Human Feedback,
arXiv, 2312.12490, arxiv, pdf, citations: n/a
Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni
- VideoPoet: A Large Language Model for Zero-Shot Video Generation,
arXiv, 2312.14125, arxiv, pdf, citations: 3
Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Rachel Hornung, Hartwig Adam, Hassan Akbari, Yair Alon, Vighnesh Birodkar · (blog.research)
- MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising,
arXiv, 2312.10899, arxiv, pdf, citations: n/a
Bingyuan Wang, Hengyu Meng, Zeyu Cai, Lanjiong Li, Yue Ma, Qifeng Chen, Zeyu Wang
- VideoLCM: Video Latent Consistency Model,
arXiv, 2312.09109, arxiv, pdf, citations: 2
Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang
- DreaMoving: A Human Video Generation Framework based on Diffusion Models,
arXiv, 2312.05107, arxiv, pdf, citations: n/a
Mengyang Feng, Jinlin Liu, Kai Yu, Yuan Yao, Zheng Hui, Xiefan Guo, Xianhui Lin, Haolan Xue, Chen Shi, Xiaowen Li · (dreamoving-project - dreamoving)
- PEEKABOO: Interactive Video Generation via Masked-Diffusion,
arXiv, 2312.07509, arxiv, pdf, citations: n/a
Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl
- FreeInit: Bridging Initialization Gap in Video Diffusion Models,
arXiv, 2312.07537, arxiv, pdf, citations: n/a
Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu · (FreeInit - TianxingWu) · (tianxingwu.github)
- Photorealistic Video Generation with Diffusion Models,
arXiv, 2312.06662, arxiv, pdf, citations: 2
Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, José Lezama · (walt-video-diffusion.github)
- Customizing Motion in Text-to-Video Diffusion Models,
arXiv, 2312.04966, arxiv, pdf, citations: n/a
Joanna Materzynska, Josef Sivic, Eli Shechtman, Antonio Torralba, Richard Zhang, Bryan Russell
- DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance,
arXiv, 2312.03018, arxiv, pdf, citations: n/a
Cong Wang, Jiaxi Gu, Panwen Hu, Songcen Xu, Hang Xu, Xiaodan Liang
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter,
arXiv, 2312.00330, arxiv, pdf, citations: n/a
Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan
- BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models,
arXiv, 2312.02813, arxiv, pdf, citations: n/a
Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion,
arXiv, 2312.04433, arxiv, pdf, citations: 3
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan · (dreamvideo-t2v.github)
- Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation,
arXiv, 2312.04483, arxiv, pdf, citations: 2
Zhiwu Qing, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yujie Wei, Yingya Zhang, Changxin Gao, Nong Sang · (higen-t2v.github)
- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation,
arXiv, 2312.04557, arxiv, pdf, citations: n/a
Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua
- Fine-grained Controllable Video Generation via Object Appearance and Context,
arXiv, 2312.02919, arxiv, pdf, citations: n/a
Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models,
arXiv, 2312.00845, arxiv, pdf, citations: n/a
Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye · (Video-Motion-Customization - HyeonHo99) · (video-motion-customization.github)
- VideoBooth: Diffusion-based Video Generation with Image Prompts,
arXiv, 2312.00777, arxiv, pdf, citations: n/a
Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu
- MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation,
arXiv, 2311.18829, arxiv, pdf, citations: n/a
Yanhui Wang, Jianmin Bao, Wenming Weng, Ruoyu Feng, Dacheng Yin, Tao Yang, Jingxu Zhang, Qi Dai, Zhiyuan Zhao, Chunyu Wang, Kai Qiu
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models,
arXiv, 2311.04145, arxiv, pdf, citations: 14
Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qing, Xiang Wang, Deli Zhao, Jingren Zhou · (i2vgen-xl.github) · (huggingface) · (i2vgen-xl - damo-vilab)
- FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline,
arXiv, 2311.13073, arxiv, pdf, citations: n/a
Vladimir Arkhipkin, Zein Shaheen, Viacheslav Vasilev, Elizaveta Dakhova, Andrey Kuznetsov, Denis Dimitrov · (ai-forever.github) · (kandinskyvideo - ai-forever)
- · (huggingface) · (generative-models - Stability-AI)
- MoVideo: Motion-Aware Video Generation with Diffusion Models,
arXiv, 2311.11325, arxiv, pdf, citations: n/a
Jingyun Liang, Yuchen Fan, Kai Zhang, Radu Timofte, Luc Van Gool, Rakesh Ranjan · (jingyunliang.github)
- Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning,
arXiv, 2311.10709, arxiv, pdf, citations: 2
Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra · (emu-video.metademolab)
- FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation,
arXiv, 2311.01813, arxiv, pdf, citations: 3
Yuanxin Liu, Lei Li, Shuhuai Ren, Rundong Gao, Shicheng Li, Sishuo Chen, Xu Sun, Lu Hou · (FETV - llyx97)
- FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling,
arXiv, 2310.15169, arxiv, pdf, citations: 3
Haonan Qiu, Menghan Xia, Yong Zhang, Yingqing He, Xintao Wang, Ying Shan, Ziwei Liu · (qbitai)
- MotionDirector: Motion Customization of Text-to-Video Diffusion Models,
arXiv, 2310.08465, arxiv, pdf, citations: 9
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou · (MotionDirector - showlab)
- LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models,
arXiv, 2309.15103, arxiv, pdf, citations: 23
Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang · (LaVie - Vchitect)
- UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation,
arXiv, 2406.01188, arxiv, pdf, citations: n/a
Xiang Wang, Shiwei Zhang, Changxin Gao, Jiayu Wang, Xiaoqiang Zhou, Yingya Zhang, Luxin Yan, Nong Sang · (UniAnimate - ali-vilab) · (unianimate.github)
- ToonCrafter - ToonCrafter
A research paper for generative cartoon interpolation.
- SignLLM: Sign Languages Production Large Language Models,
arXiv, 2405.10718, arxiv, pdf, citations: n/a
Sen Fang, Lei Wang, Ce Zheng, Yapeng Tian, Chen Chen · (signllm.github)
- MusePose - TMElyralab
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation.
- MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model,
arXiv, 2405.20222, arxiv, pdf, citations: n/a
Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng
- EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture,
arXiv, 2405.18991, arxiv, pdf, citations: n/a
Jiaqi Xu, Xinyi Zou, Kunzhe Huang, Yunkuo Chen, Bo Liu, MengLi Cheng, Xing Shi, Jun Huang
- VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation,
arXiv, 2405.18156, arxiv, pdf, citations: n/a
Qilin Wang, Zhengkai Jiang, Chengming Xu, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao, Weijian Cao, Chengjie Wang, Yanwei Fu
- CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers,
arXiv, 2405.13195, arxiv, pdf, citations: n/a
Andrew Marmon, Grant Schindler, José Lezama, Dan Kondratyuk, Bryan Seybold, Irfan Essa
- Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing,
arXiv, 2405.04496, arxiv, pdf, citations: n/a
Yi Zuo, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Shuyuan Yang, Yuwei Guo
- ID-Animator: Zero-Shot Identity-Preserving Human Video Generation,
arXiv, 2404.15275, arxiv, pdf, citations: n/a
Xuanhua He, Quande Liu, Shengju Qian, Xin Wang, Tao Hu, Ke Cao, Keyu Yan, Man Zhou, Jie Zhang · (id-animator.github) · (ID-Animator - ID-Animator)
- Dynamic Typography: Bringing Text to Life via Video Diffusion Prior,
arXiv, 2404.11614, arxiv, pdf, citations: n/a
Zichen Liu, Yihao Meng, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu · (animate-your-word.github)
- AniClipart: Clipart Animation with Text-to-Video Priors,
arXiv, 2404.12347, arxiv, pdf, citations: n/a
Ronghuan Wu, Wanchao Su, Kede Ma, Jing Liao
- AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment,
arXiv, 2404.04946, arxiv, pdf, citations: n/a
Yuanfeng Xu, Yuhao Chen, Zhongzhan Huang, Zijian He, Guangrun Wang, Philip Torr, Liang Lin · (AnimateZoo - JustinXu0) · (justinxu0.github)
- TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models,
arXiv, 2403.17005, arxiv, pdf, citations: n/a
Zhongwei Zhang, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Ting Yao, Yang Cao, Tao Mei · (trip-i2v.github)
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance,
arXiv, 2403.14781, arxiv, pdf, citations: n/a
Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu · (fudan-generative-vision.github) · (champ - fudan-generative-vision)
- Explorative Inbetweening of Time and Space,
arXiv, 2403.14611, arxiv, pdf, citations: n/a
Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Abrevaya, Michael J. Black, Xuaner Zhang · (time-reversal.github)
- StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN,
arXiv, 2403.14186, arxiv, pdf, citations: n/a
Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari, Junyong Noh
- AnimateDiff-Lightning: Cross-Model Diffusion Distillation,
arXiv, 2403.12706, arxiv, pdf, citations: n/a
Shanchuan Lin, Xiao Yang · (huggingface)
- Animate Your Motion: Turning Still Images into Dynamic Videos,
arXiv, 2403.10179, arxiv, pdf, citations: n/a
Mingxiao Li, Bo Wan, Marie-Francine Moens, Tinne Tuytelaars
- WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs,
arXiv, 2403.07944, arxiv, pdf, citations: n/a
Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou
- Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts,
arXiv, 2403.08268, arxiv, pdf, citations: n/a
Yue Ma, Yingqing He, Hongfa Wang, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu · (follow-your-click.github)
- Audio-Synchronized Visual Animation,
arXiv, 2403.05659, arxiv, pdf, citations: n/a
Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado · (lzhangbj.github)
- DragAnything: Motion Control for Anything using Entity Representation,
arXiv, 2403.07420, arxiv, pdf, citations: n/a
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang
- AtomoVideo: High Fidelity Image-to-Video Generation,
arXiv, 2403.01800, arxiv, pdf, citations: n/a
Litong Gong, Yiran Zhu, Weijie Li, Xiaoyang Kang, Biao Wang, Tiezheng Ge, Bo Zheng · (atomo-video.github)
- Animated Stickers: Bringing Stickers to Life with Video Diffusion,
arXiv, 2402.06088, arxiv, pdf, citations: n/a
David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard
- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling,
arXiv, 2401.15977, arxiv, pdf, citations: n/a
Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin
- Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons,
arXiv, 2401.13363, arxiv, pdf, citations: n/a
Zhe Xu, Kun Wei, Xu Yang, Cheng Deng
- Synthesizing Moving People with 3D Control,
arXiv, 2401.10889, arxiv, pdf, citations: n/a
Boyi Li, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik · (boyiliee.github)
- Continuous Piecewise-Affine Based Motion Model for Image Animation,
arXiv, 2401.09146, arxiv, pdf, citations: n/a
Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma
- Moore-AnimateAnyone - MooreThreads
- LongAnimateDiff - Lightricks · (huggingface)
- WonderJourney: Going from Anywhere to Everywhere,
arXiv, 2312.03884, arxiv, pdf, citations: n/a
Hong-Xing Yu, Haoyi Duan, Junhwa Hur, Kyle Sargent, Michael Rubinstein, William T. Freeman, Forrester Cole, Deqing Sun, Noah Snavely, Jiajun Wu · (WonderJourney - KovenYu) · (kovenyu) · (mp.weixin.qq)
- Animate124: Animating One Image to 4D Dynamic Scene,
arXiv, 2311.14603, arxiv, pdf, citations: 2
Yuyang Zhao, Zhiwen Yan, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee · (Animate124 - HeliosZhao) · (animate124.github)
- DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors,
arXiv, 2310.12190, arxiv, pdf, citations: 8
Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan · (doubiiu.github) · (DynamiCrafter - Doubiiu) · (huggingface)
- AnimateZero: Video Diffusion Models are Zero-Shot Image Animators,
arXiv, 2312.03793, arxiv, pdf, citations: n/a
Jiwen Yu, Xiaodong Cun, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang · (vvictoryuki.github)
- LivePhoto: Real Image Animation with Text-guided Motion Control,
arXiv, 2312.02928, arxiv, pdf, citations: 1
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao · (LivePhoto - XavierCHEN34) · (xavierchen34.github)
- Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models,
arXiv, 2312.01409, arxiv, pdf, citations: n/a
Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein
- PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models,
arXiv, 2312.13964, arxiv, pdf, citations: 4
Yiming Zhang, Zhening Xing, Yanhong Zeng, Youqing Fang, Kai Chen · (PIA - open-mmlab)
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model,
arXiv, 2311.16498, arxiv, pdf, citations: n/a
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou · (magic-animate - magic-research) · (huggingface) · (jiqizhixin) · (magic-animate-for-windows - sdbds)
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation,
arXiv, 2311.17117, arxiv, pdf, citations: 1
Li Hu, Xin Gao, Peng Zhang, Ke Sun, Bang Zhang, Liefeng Bo · (AnimateAnyone - HumanAIGC) · (AnimateAnyone-unofficial - guoqincode)
- SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction,
arXiv, 2310.20700, arxiv, pdf, citations: 7
Xinyuan Chen, Yaohui Wang, Lingjun Zhang, Shaobin Zhuang, Xin Ma, Jiashuo Yu, Yali Wang, Dahua Lin, Yu Qiao, Ziwei Liu · (SEINE - Vchitect) · (huggingface)
- MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer,
arXiv, 2311.12052, arxiv, pdf, citations: n/a
Di Chang, Yichun Shi, Quankai Gao, Jessica Fu, Hongyi Xu, Guoxian Song, Qing Yan, Xiao Yang, Mohammad Soleymani
- Make Pixels Dance: High-Dynamic Video Generation,
arXiv, 2311.10982, arxiv, pdf, citations: 5
Yan Zeng, Guoqiang Wei, Jiani Zheng, Jiaxin Zou, Yang Wei, Yuchen Zhang, Hang Li · (makepixelsdance.github)
- DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory,
arXiv, 2308.08089, arxiv, pdf, citations: 17
Shengming Yin, Chenfei Wu, Jian Liang, Jie Shi, Houqiang Li, Gong Ming, Nan Duan · (DragNUWA - ProjectNUWA) · (huggingface)
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning,
arXiv, 2307.04725, arxiv, pdf, citations: 60
Yuwei Guo, Ceyuan Yang, Anyi Rao, Yaohui Wang, Yu Qiao, Dahua Lin, Bo Dai · (AnimateDiff - guoyww)
- Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos,
arXiv, 2304.01186, arxiv, pdf, citations: 31
Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, Qifeng Chen · (FollowYourPose - mayuelala)
- ReVideo: Remake a Video with Motion and Content Control,
arXiv, 2405.13865, arxiv, pdf, citations: n/a
Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang · (mc-e.github)
- AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks,
arXiv, 2403.14468, arxiv, pdf, citations: n/a
Max Ku, Cong Wei, Weiming Ren, Huan Yang, Wenhu Chen · (tiger-ai-lab.github) · (AnyV2V - TIGER-AI-Lab) · (huggingface)
- Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation,
arXiv, 2403.13745, arxiv, pdf, citations: n/a
Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li
- FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation,
arXiv, 2403.12962, arxiv, pdf, citations: n/a
Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy · (fresco - williamyang1991)
- Video Editing via Factorized Diffusion Distillation,
arXiv, 2403.09334, arxiv, pdf, citations: n/a
Uriel Singer, Amit Zohar, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Devi Parikh, Yaniv Taigman
- FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing,
arXiv, 2403.06269, arxiv, pdf, citations: n/a
Youyuan Zhang, Xuan Ju, James J. Clark
- Anything in Any Scene: Photorealistic Video Object Insertion,
arXiv, 2401.17509, arxiv, pdf, citations: n/a
Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang · (anythinginanyscene.github)
-
ActAnywhere: Subject-Aware Video Background Generation,
arXiv, 2401.10822
, arxiv, pdf, cication: -1Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang · (actanywhere.github)
-
Object-Centric Diffusion for Efficient Video Editing,
arXiv, 2401.05735
, arxiv, pdf, cication: -1Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian
-
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis,
arXiv, 2312.17681
, arxiv, pdf, cication: -1Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda · (jeff-liangf.github)
-
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis,
arXiv, 2312.13834
, arxiv, pdf, cication: -1Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang, Yichen Jia, Kapil Krishnakumar, Tong Xiao, Feng Liang, Licheng Yu, Peter Vajda
-
MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers,
arXiv, 2312.12468
, arxiv, pdf, cication: -1Haoyu Ma, Shahin Mahdizadehaghdam, Bichen Wu, Zhipeng Fan, Yuchao Gu, Wenliang Zhao, Lior Shapira, Xiaohui Xie
-
VidToMe: Video Token Merging for Zero-Shot Video Editing,
arXiv, 2312.10656
, arxiv, pdf, cication: -1Xirui Li, Chao Ma, Xiaokang Yang, Ming-Hsuan Yang
-
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models,
arXiv, 2312.04524
, arxiv, pdf, cication: -1Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag · (RAVE - rehg-lab)
-
MagicStick: Controllable Video Editing via Control Handle Transformations,
arXiv, 2312.03047
, arxiv, pdf, cication: 1Yue Ma, Xiaodong Cun, Yingqing He, Chenyang Qi, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen
-
DragVideo: Interactive Drag-style Video Editing,
arXiv, 2312.02216
, arxiv, pdf, cication: -1Yufan Deng, Ruida Wang, Yuhao Zhang, Yu-Wing Tai, Chi-Keung Tang
-
Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models,
arXiv, 2312.01409
, arxiv, pdf, cication: -1Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein
-
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence,
arXiv, 2312.02087
, arxiv, pdf, cication: 1Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang
-
Sketch Video Synthesis,
arXiv, 2311.15306
, arxiv, pdf, cication: -1Yudian Zheng, Xiaodong Cun, Menghan Xia, Chi-Man Pun · (sketchvideo - yudianzheng)
-
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation,
arXiv, 2306.07954
, arxiv, pdf, cication: 48Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy · (Rerender_A_Video - williamyang1991)
-
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark,
arXiv, 2405.19707
, arxiv, pdf, cication: -1Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang · (DeMamba - chenhaoxing)
-
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models,
arXiv, 2403.06098
, arxiv, pdf, cication: -1Wenhao Wang, Yi Yang
-
VBench: Comprehensive Benchmark Suite for Video Generative Models,
arXiv, 2311.17982
, arxiv, pdf, cication: -1Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit · (VBench - Vchitect)
-
SoraWebui - SoraWebui
SoraWebui is an open-source Sora web client, enabling users to easily create videos from text with OpenAI's Sora model.
-
VGen - ali-vilab
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
-
MoneyPrinterTurbo - harry0703
Generate short videos with one click using large language models