✍️Quick Start
MM-LLMs: We built a website tracking the latest advances in MM-LLMs. We aim to support researchers in MM-LLMs with our work, and we encourage everyone to contribute to the website by adding the latest...
Paper: Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference GitHub Link Publisher: arXiv Author Affiliation: Westlake University Functional Division ...
Paper: VL-Mamba: Exploring State Space Models for Multimodal Learning GitHub Link Publisher: arXiv Author Affiliation: The University of Adelaide Functional Division Understan...
Paper: KEBench: A Benchmark on Knowledge Editing for Large Vision-Language Models Publisher: arXiv Author Affiliation: University of Chinese Academy of Sciences
Paper: DeepSeek-VL: Towards Real-World Vision-Language Understanding GitHub Link Publisher: arXiv Author Affiliation: DeepSeek-AI Functional Division Understanding Gener...
Paper: The All-Seeing Project V2: Towards General Relation Comprehension of the Open World GitHub Link Publisher: arXiv Author Affiliation: Shanghai AI Laboratory Functional Division ...
Paper: A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models Project Link Publisher: arXiv Author Affiliation: Shanghai Jiao Tong University
Paper: AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling GitHub Link Publisher: arXiv Author Affiliation: Fudan University Functional Division Understanding ...
Paper: AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling GitHub Link Publisher: arXiv Author Affiliation: Fudan University Multi-turn ✔ ✖ Input Mo...
Paper: VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction Optimization GitHub Link Publisher: arXiv Author Affiliation: Baidu Inc. F...