JAM
Paper: Jointly Training Large Autoregressive Multimodal Models | GitHub Link: None | Publisher: Arxiv | Author Affiliation: Meta AI | Functional Division: Understanding, Generation ...
AnyMAL
Paper: AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model | GitHub Link: None | Author Affiliation: FAIR, Meta & Meta Reality Labs | Functional Division: Understanding ...
Q-Bench
Paper: Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision | Project Link | Publisher: Arxiv | Author Affiliation: Nanyang Technological University
LLaVA-RLHF
Paper: Aligning Large Multimodal Models with Factually Augmented RLHF | GitHub Link | Publisher: Arxiv | Author Affiliation: UC Berkeley | Type: SFT, RLHF, Multi-turn ...
Kosmos-2.5
Paper: Kosmos-2.5: A Multimodal Literate Model | GitHub Link | Publisher: Arxiv | Author Affiliation: Microsoft | Functional Division: Understanding, Generation | Design Division: ...
DreamLLM
Paper: DreamLLM: Synergistic Multimodal Comprehension and Creation | GitHub Link | Publisher: ICLR 2024 | Author Affiliation: MEGVII | Functional Division: Understanding, Generation ...
NExT-GPT
Paper: NExT-GPT: Any-to-Any Multimodal LLM | GitHub Link | Publisher: ICLR 2024 | Author Affiliation: National University of Singapore | Functional Division: Understanding, Generation ... | Type: SFT, RLHF, Multi-turn ...
LaVIT
Paper: Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization | GitHub Link | Publisher: ICLR 2024 | Author Affiliation: Kuaishou Technology | Functional Division: ...