Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports
Format: | Article |
---|---|
Language: | English |
Summary: | Medical images and radiology reports are crucial for diagnosing medical
conditions, highlighting the importance of quantitative analysis for clinical
decision-making. However, the diversity and cross-source heterogeneity of these
data challenge the generalizability of current data-mining methods. Multimodal
large language models (MLLMs) have recently transformed many domains,
significantly affecting the medical field. Notably, Gemini-Vision-series
(Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in
Artificial General Intelligence (AGI) for computer vision, showcasing their
potential in the biomedical domain. In this study, we exhaustively evaluated
Gemini, GPT-4, and 4 other popular large models across 14 medical imaging
datasets, covering 5 medical imaging categories
(dermatology, radiology, dentistry, ophthalmology, and endoscopy), and 3
radiology report datasets. The investigated tasks encompass disease
classification, lesion segmentation, anatomical localization, disease
diagnosis, report generation, and lesion detection. Our experimental results
demonstrated that Gemini-series models excelled in report generation and lesion
detection but faced challenges in disease classification and anatomical
localization. Conversely, GPT-series models exhibited proficiency in lesion
segmentation and anatomical localization but encountered difficulties in
disease diagnosis and lesion detection. Additionally, both the Gemini series
and GPT series contain models that have demonstrated commendable generation
efficiency. While both model series hold promise in reducing physician workload,
alleviating pressure on limited healthcare resources, and fostering
collaboration between clinical practitioners and artificial intelligence
technologies, substantial enhancements and comprehensive validations remain
imperative before clinical deployment. |
DOI: | 10.48550/arxiv.2407.05758 |