🎞 VLog: Video as a Long Document

Given a long video, we convert it into a document containing visual and audio information. By sending this document to ChatGPT, we can chat about the video!
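The idea above can be illustrated with a minimal sketch. It assumes the video has already been split into timed segments, each with a visual caption, and that speech recognition has produced timed text chunks. The `Segment` class, `attach_speech`, `to_document`, and the output line format are hypothetical names chosen for illustration; they are not taken from the VLog codebase.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """One video segment (hypothetical structure, not VLog's own)."""
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    caption: str  # text produced by a vision captioner
    speech: List[str] = field(default_factory=list)  # ASR text assigned to this segment


def attach_speech(segments: List[Segment],
                  asr_chunks: List[Tuple[float, float, str]]) -> List[Segment]:
    """Assign each ASR chunk (start, end, text) to the segment containing its midpoint."""
    for start, end, text in asr_chunks:
        mid = (start + end) / 2
        for seg in segments:
            if seg.start <= mid < seg.end:
                seg.speech.append(text)
                break
    return segments


def to_document(segments: List[Segment]) -> str:
    """Render the segments as one plain-text document, one line per segment."""
    lines = []
    for seg in segments:
        speech = " ".join(seg.speech) if seg.speech else "(no speech)"
        lines.append(
            f"[{seg.start:.0f}s-{seg.end:.0f}s] Visual: {seg.caption} | Audio: {speech}"
        )
    return "\n".join(lines)


# Usage with made-up captions and transcripts.
segments = [
    Segment(0, 10, "a man rides a motorbike"),
    Segment(10, 20, "a fruit stall"),
]
attach_speech(segments, [(2, 4, "hello"), (11, 13, "how much")])
document = to_document(segments)
print(document)
```

A text document of this kind can then be passed to a language model as context for questions about the video.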

News

23/April/2023: We released a Hugging Face Gradio demo!
20/April/2023: We released our project on GitHub, along with a local Gradio demo!

To Do List

Done

LLM Reasoner: ChatGPT (multilingual) + LangChain
Vision Captioner: BLIP2 + GRIT
ASR Translator: Whisper (multilingual)
Video Segmenter: KTS
Hugging Face Space

Doing

Optimize codebase efficiency
Improve vision models: MiniGPT-4 / LLaVA, the Segment Anything family
Improve the ASR Translator for better alignment
Introduce temporal dependency
Replace ChatGPT with our own trained LLM

🧸 Examples

[ News - GPT4 launch event ]

[ TV series - 征服之华强买瓜 ]

[ TV series - The Big Bang Theory ]

[ Travel video - Travel in Rome ]

[ Vlog - Basketball training ]

🔨 Preparation

Please find installation instructions in install.md.

🌟 Start here

Run from the command line

python main.py --video_path examples/buy_watermelon.mp4 --openai_api_key xxxxx

The generated video document will be saved to examples/buy_watermelon.log
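To process several videos in one go, the command above can be wrapped in a small script. The sketch below is only an illustration: `build_command` and `run_batch` are hypothetical helpers, and reading the key from an `OPENAI_API_KEY` environment variable is an assumption of this example, not something `main.py` is documented to do. Only the `--video_path` and `--openai_api_key` flags shown above are used.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_command(video_path, api_key):
    """Build the main.py invocation shown above for a single video."""
    return [
        sys.executable, "main.py",
        "--video_path", str(video_path),
        "--openai_api_key", api_key,
    ]


def run_batch(video_dir, api_key, dry_run=True):
    """Build one command per .mp4 file in video_dir; run them unless dry_run is set."""
    commands = [
        build_command(path, api_key)
        for path in sorted(Path(video_dir).glob("*.mp4"))
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing video.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Assumption: the key is kept in an environment variable for this sketch.
    key = os.environ.get("OPENAI_API_KEY", "xxxxx")
    for cmd in run_batch("examples", key, dry_run=True):
        print(" ".join(cmd))
```

With `dry_run=True` the script only prints the commands it would run; set `dry_run=False` from the repository root to execute them.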

Run in Gradio

python main_gradio.py --openai_api_key xxxxx

🙋 Suggestion

Stay tuned for our project 🔥

If you have suggestions or features you would like to see implemented in this codebase, feel free to email us at [email protected] or [email protected], or open an issue.

😊 Acknowledgment

This work is based on ChatGPT, BLIP2, GRIT, KTS, Whisper, LangChain, and Image2Paragraph.
