MMMU
Paper: MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI | Project Link | Publisher: arXiv | Author Affiliation: IN.AI Research
Paper: GeoChat: Grounded Large Vision-Language Model for Remote Sensing | Project Link | Publisher: CVPR 2024 | Author Affiliation: Mohamed bin Zayed University of AI
Paper: Multimodal Large Language Models: A Survey | Project Link: None | Publisher: BigData 2023 | Author Affiliation: Jinan University
Paper: A Survey on Multimodal Large Language Models for Autonomous Driving | Project Link | Publisher: WACV 2024 | Author Affiliation: Purdue University
Paper: ShareGPT4V: Improving Large Multi-Modal Models with Better Captions | GitHub Link | Publisher: arXiv | Author Affiliation: University of Science and Technology of China | Functional Division: ...
Paper: LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | GitHub Link | Publisher: arXiv | Author Affiliation: Harbin Institute of Technology | Functional Division: ...
Paper: DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | GitHub Link: None | Publisher: arXiv | Author Affiliation: Universi...
Paper: DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback | GitHub Link | Publisher: arXiv | Author Affiliation: SRI International | ...