Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control"
Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
Research Code for Multimodal-Cognition Team in Ant Group
Kani extension for supporting vision-language models (VLMs). Comes with model-agnostic support for GPT-Vision and LLaVA.
LLaVA base model for use with Autodistill.
Official repository of the paper: Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics
[ACL ARR Under Review] Dataset and Code of "ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction"
Streamlit app to chat with images using Multi-modal LLMs.