ViperGPT
Paper: ViperGPT: Visual Inference via Python Execution for Reasoni GitHub Link Publisher: arXiv Author Affiliation: Columbia University Functional Division Understanding ...
Paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models GitHub Link Publisher: arXiv Author Affiliation: Microsoft Research Asia Functional Division ...
Paper: PaLM-E: An Embodied Multimodal Language Model GitHub Link Publisher: arXiv Author Affiliation: Google Functional Division Understanding Generation Desig...
Paper: Language Is Not All You Need: Aligning Perception with Language Models GitHub Link Publisher: arXiv Author Affiliation: Microsoft Functional Division Understanding ...
Paper: Grounding Language Models to Images for Multimodal Inputs and Outputs GitHub Link Publisher: ICML 2023 Author Affiliation: Carnegie Mellon University Functional Division ...
Paper: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models GitHub Link Publisher: ICML 2023 Author Affiliation: Salesforce Research Fun...
Paper: Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering Project Link Publisher: NeurIPS 2022 Author Affiliation: University of California, Los Angel...
Paper: IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning Project Link Publisher: NeurIPS 2021 Author Affiliation: UCLA
Paper: A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge Project Link Publisher: arXiv Author Affiliation: PRIOR @ Allen Institute for AI
Paper: 🦩Flamingo: a Visual Language Model for Few-Shot Learning GitHub Link: None Publisher: NeurIPS 2022 Author Affiliation: DeepMind Functional Division Understanding ...