From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. However, there is still a wide gap between the performance of recent MLLM-based applications and the expectation of the broad public, even though the mos...

Bibliographic details
Published in: arXiv.org 2024-01
Main authors: Lu, Chaochao; Chen, Qian; Zheng, Guodong; Fan, Hongxing; Gao, Hongzhi; Zhang, Jie; Shao, Jing; Deng, Jingyi; Fu, Jinlan; Huang, Kexin; Li, Kunchang; Li, Lijun; Wang, Limin; Lu, Sheng; Chen, Meiqi; Zhang, Ming; Ren, Qibing; Chen, Sirui; Gui, Tao; Ouyang, Wanli; Wang, Yali; Teng, Yan; Wang, Yaru; Wang, Yi; He, Yinan; Wang, Yingchun; Wang, Yixu; Zhang, Yongting; Yu, Qiao; Shen, Yujiong; Mou, Yurong; Chen, Yuxi; Zhang, Zaibin; Shi, Zhelun; Yin, Zhenfei; Wang, Zhipin
Format: Article
Language: eng
Online access: Full text