arXiv 153
- Cobra
- VL-Mamba
- KEBench
- DeepSeek-VL
- ASMv2
- CogBench
- AnyGPT
- Any-to-any Multimodal Instruction Dataset
- VisLingInstruct
- ViGoR
- MMViG
- SPHINX-X
- MobileVLM V2
- CogCoM
- Video-LaVIT
- VLGuard
- VLGuard’s IT
- Mobile-Agent
- MoE-LLaVA
- LLaVA-MoLE
- InternLM-XComposer2
- WebVoyager
- Vary-toy
- RTVLM
- RPG
- CMMMU
- MLLM-Tool
- SkyEyeGPT
- MM-Interleaved
- DiffusionGPT
- α-UMi
- ModaVerse
- GroundingGPT
- 3DMIT
- GOAT-Bench
- DocLLM
- TinyGPT-V
- MobileVLM
- Related Survey 3
- V*
- InternVL
- Emu-2
- Gemini
- CLOVA
- Osprey
- Osprey’s IT
- VL-GPT
- CogAgent
- VILA
- MP5
- Lyrics
- VLFeedback
- Silkie
- MME Perception
- MME Cognition
- BenchLMM
- PixelLM
- RLHF-V
- RLHF-V’s IT
- Dolphins
- mPLUG-PaperOwl
- X-InstructBLIP
- X-InstructBLIP’s IT
- CoDi-2
- VIM
- LLaMA-VID
- MMMU
- ShareGPT4V
- ShareGPT4V’s IT
- LION
- DocPedia
- DRESS
- DRESS’s IT
- Qwen-Audio
- Volcano
- Monkey
- Related Survey 4
- LLaVA-Plus
- TEAL
- mPLUG-Owl2
- GLaMM
- CogVLM
- ControlLLM
- MM-Vet
- SALMONN
- MiniGPT-v2
- LLaVA-1.5
- Kosmos-G
- MiniGPT-5
- JAM
- AnyMAL
- QBench
- LLaVA-RLHF
- Kosmos-2.5
- InternLM-XComposer
- CM3Leon
- PointLLM
- Qwen-VL
- VisCPM
- StableLLaVA
- Chat-3D
- MMBenchmark
- MMBench-Chinese
- SparklesChat
- SparklesChat’s IT
- ASM
- OpenFlamingo
- LISA
- 3D-LLM
- MGVLID
- ChatSpot
- BuboGPT
- BuboGPT’s IT
- SEED
- SEED-Bench (Image)
- SVIT
- GPT4RoI
- Lynx
- mPLUG-DocOwl
- mPLUG-DocOwl’s IT
- LLaVAR’s IT
- LLaVAR
- Shikra
- Kosmos-2
- Related Survey 1
- AudioPaLM
- Video-ChatGPT
- Video-ChatGPT’s IT
- MIMIC-IT
- M3IT
- LLaVA-Med
- PaLI-X
- PandaGPT
- PandaGPT’s IT
- DetGPT
- VideoChat
- VideoChat’s IT
- MultiModal-GPT
- X-LLM
- Otter
- mPLUG-Owl
- AudioGPT
- MiniGPT-4
- MiniGPT-4’s IT
- HuggingGPT
- VSR
- MM-REACT
- GPT-4
- ViperGPT
- Visual ChatGPT
- PaLM-E
- Kosmos-1
- OKVQA