Generative Pre-trained Transformer 4 makes cardiovascular magnetic resonance reports easy to understand

Bibliographic details
Published in:	Journal of cardiovascular magnetic resonance 2024, Vol. 26 (1), p. 101035, Article 101035
Main authors:	Salam, Babak; Kravchenko, Dmitrij; Nowak, Sebastian; Sprinkart, Alois M.; Weinhold, Leonie; Odenthal, Anna; Mesropyan, Narine; Bischoff, Leon M.; Attenberger, Ulrike; Kuetting, Daniel L.; Luetkens, Julian A.; Isaak, Alexander
Format:	Article
Language:	English
Subjects:	Artificial intelligence; Cardiovascular Diseases - diagnostic imaging; Cardiovascular magnetic resonance; Comprehension; Female; Generative Pre-trained Transformers; Health Literacy; Humans; Large language models; Magnetic Resonance Imaging; Male; Observer Variation; Patient Education as Topic; Predictive Value of Tests; Reproducibility of Results; Text simplification
Online access:	Full text

Description
Abstract:	Patients are increasingly using Generative Pre-trained Transformer 4 (GPT-4) to better understand their own radiology findings. This study evaluated the performance of GPT-4 in transforming cardiovascular magnetic resonance (CMR) reports into text that is comprehensible to medical laypersons. ChatGPT with the GPT-4 architecture was used to generate three different explained versions of each of 20 different CMR reports (n = 60) using the same prompt: “Explain the radiology report in a language understandable to a medical layperson”. Two cardiovascular radiologists evaluated understandability, factual correctness, completeness of relevant findings, and lack of potential harm, while 13 medical laypersons evaluated the understandability of the original and the GPT-4 reports on a Likert scale (1 “strongly disagree”, 5 “strongly agree”). Readability was measured using the Automated Readability Index (ARI). Linear mixed-effects models (values given as median [interquartile range]) and the intraclass correlation coefficient (ICC) were used for statistical analysis. GPT-4 reports were generated in 52 ± 13 s on average. GPT-4 reports achieved a lower ARI score (10 [9–12] vs 5 [4–6]; p
ISSN:	1097-6647; 1532-429X
DOI:	10.1016/j.jocmr.2024.101035
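
The abstract names the Automated Readability Index (ARI) as its readability measure. ARI is conventionally defined as 4.71 × (characters per word) + 0.5 × (words per sentence) − 21.43, where a lower score indicates text that is easier to read. The following is a minimal sketch of that standard formula, not the tooling the authors used (the record does not say). The function name and the regex-based word and sentence splitting are assumptions made for illustration.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the ARI of `text` using the standard published coefficients.

    Assumption: words are runs of letters/digits (optionally joined by an
    apostrophe or hyphen) and sentences end at '.', '!' or '?'. This simple
    splitting miscounts abbreviations and decimals such as "5.2".
    """
    words = re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Count only letters and digits as characters, ignoring punctuation.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    plain = "The heart is fine. It pumps well."
    technical = "Cardiovascular magnetic resonance demonstrates preserved biventricular function."
    print(round(automated_readability_index(plain), 2))
    print(round(automated_readability_index(technical), 2))
```

With this sketch, short sentences built from short words score lower than a single sentence packed with long technical terms, which is the direction of change the abstract reports for the GPT-4 versions.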