ControlLLM
Paper: ControlLLM: Augment Language Models with Tools by Searching on Graphs GitHub Link Publisher: arXiv Author Affiliation: The Hong Kong University of Science and Technolo Functional ...
Paper: MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities Project Link Publisher: arXiv Author Affiliation: National University of Singapore
Paper: SALMONN: Towards Generic Hearing Abilities for Large Language Models GitHub Link Publisher: arXiv Author Affiliation: Tsinghua University Functional Division Understand...
Paper: Fuyu-8B: A Multimodal Architecture for AI Agents GitHub Link: None Publisher: Website Author Affiliation: ADEPT Functional Division Understanding Generation ...
Paper: MINIGPT-V2: LARGE LANGUAGE MODEL AS A UNIFIED INTERFACE FOR VISION-LANGUAGE MULTITASK LEARNING GitHub Link Publisher: arXiv Author Affiliation: King Abdullah University of Science a...
Paper: Improved baselines with visual instruction tuning GitHub Link Publisher: arXiv Author Affiliation: University of Wisconsin–Madison Functional Division Understanding ...
Paper: Kosmos-G: Generating Images in Context with Multimodal Large Language Models GitHub Link Publisher: arXiv Author Affiliation: Microsoft Research Functional Division Und...
Paper: MINIGPT-5: INTERLEAVED VISION-AND-LANGUAGE GENERATION VIA GENERATIVE VOKENS GitHub Link Publisher: arXiv Author Affiliation: University of California, Santa Cruz Functional Divisi...
Paper: MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts Project Link Publisher: ICLR 2024 Author Affiliation: UCLA
Paper: LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment GitHub Link Publisher: ICLR 2024 Author Affiliation: Peking University Functi...