MIMIC-IT
Paper: MIMIC-IT: Multi-Modal In-Context Instruction Tuning GitHub Link Publisher: arXiv Author Affiliation: Nanyang Technological University Type SFT RLHF Mult...
Paper: M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning GitHub Link Publisher: arXiv Author Affiliation: The University of Hong Kong Type SFT ...
Paper: Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding GitHub Link Publisher: EMNLP 2023 Author Affiliation: Alibaba Group Functional Division ...
Paper: Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding GitHub Link Publisher: EMNLP 2023 Author Affiliation: Alibaba Group Type SFT ...
Paper: LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day GitHub Link Publisher: arXiv Author Affiliation: Microsoft Functional Division Unde...
Paper: PaLI-X: On scaling up a multilingual vision and language model GitHub Link: None Publisher: arXiv Author Affiliation: Google Research Functional Division Understanding ...
Paper: Generating Images with Multimodal Language Models GitHub Link Publisher: NeurIPS 2023 Author Affiliation: Carnegie Mellon University Functional Division Understanding ...
Paper: PandaGPT: One Model To Instruction-Follow Them All GitHub Link Publisher: arXiv Author Affiliation: University of Cambridge Functional Division Understanding Gene...
Paper: PandaGPT: One Model To Instruction-Follow Them All GitHub Link Publisher: arXiv Author Affiliation: University of Cambridge Type SFT RLHF Multi-turn ...
Paper: EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought GitHub Link Publisher: NeurIPS 2023 Author Affiliation: The University of Hong Kong, Functional Division ...