✍️Quick Start

MM-LLMs We build a website for the latest advances in MM-LLMs. We aim to support researchers in MM-LLMs with our work, and we encourage everyone to contribute to the website by adding the latest...

Jan 1, 2001 tutorials

DeepSeek-VL2

Paper: DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Project Link Publisher: Arxiv Author Affiliation: DeepSeek-AI Functional Division ...

Dec 13, 2024 Arxiv

Apollo

Paper: Apollo: An Exploration of Video Understanding in Large Multimodal Models Project Link Publisher: Arxiv Author Affiliation: Meta GenAI Functional Division Understanding ...

Dec 13, 2024 Arxiv

StreamChat

Paper: StreamChat: Chatting with Streaming Video Project Link Publisher: Arxiv Author Affiliation: CUHK MMLab Functional Division Understanding Generation Desi...

Dec 11, 2024 Arxiv

LinVT

Paper: LinVT: Empower Your Image-level Large Language Model to Understand Videos Project Link Publisher: Arxiv Author Affiliation: Meituan Functional Division Understanding ...

Dec 11, 2024 Arxiv

CompCap

Paper: CompCap: Improving Multimodal Large Language Models with Composite Captions Publisher: Arxiv Author Affiliation: Meta Functional Division Understanding Generation ...

Dec 6, 2024 Arxiv

T2Vid

Paper: T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs Project Link Publisher: Arxiv Author Affiliation: USTC Functional Division Understanding ...

Dec 2, 2024 Arxiv

LongVU

Paper: LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding Project Link Publisher: Arxiv Author Affiliation: Meta AI Functional Division Understa...

Oct 22, 2024 Arxiv

AuroraCap

Paper: AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Project Link Publisher: Arxiv Author Affiliation: University of Washington Functional Division ...

Oct 4, 2024 Arxiv

OMG-LLaVA

Paper: OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Project Link Publisher: NeurIPS 2024 Author Affiliation: Wuhan University Functional Division...

Oct 1, 2024 NeurIPS 2024