Screening Patient Misidentification Errors Using a Deep Learning Model of Chest Radiography: A Seven Reader Study

Bibliographic details
Published in: Journal of imaging informatics in medicine, 2024-09
Main authors: Kim, Kiduk; Cho, Kyungjin; Eo, Yujeong; Kim, Jeeyoung; Yun, Jihye; Ahn, Yura; Seo, Joon Beom; Hong, Gil-Sun; Kim, Namkug
Format: Article
Language: English
Online access: Full text
Description
Summary: We aimed to evaluate the ability of deep learning (DL) models to identify patients from a pair of chest radiographs (CXRs) and to compare their performance with that of human experts. In this retrospective study, patient identification DL models were developed using 240,004 CXRs. The models were validated using multiple datasets that include different populations, namely internal validation, CheXpert, and Chest ImaGenome (CIG). Model performance was analyzed in terms of disease change status. The models' performance in identifying patients from paired CXRs was compared with that of three junior radiology residents (group I), two senior radiology residents (group II), and two board-certified expert radiologists (group III). For the reader study, 240 patients (age, 56.617 ± 13.690 years; 113 females; 160 same pairs) were evaluated. A one-sided non-inferiority test was performed with a margin of 0.05. SimChest, our similarity-based DL model, demonstrated the best patient identification performance across multiple datasets, regardless of disease change status (area under the receiver operating characteristic curve range: internal validation, 0.992-0.999; CheXpert, 0.933-0.948; CIG, 0.949-0.951). The radiologists identified patients from the paired CXRs with a mean accuracy of 0.900 (95% confidence interval: 0.852-0.948), and performance increased with experience (mean accuracy: group I, 0.874; group II, 0.904; group III, 0.935; SimChest, 0.904). SimChest achieved non-inferior performance compared to the radiologists (P for non-inferiority: 0.015). The findings of this diagnostic study indicate that DL models can screen for patient misidentification using a pair of CXRs with performance non-inferior to that of human experts.
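The summary reports a one-sided non-inferiority test with a margin of 0.05 but does not say which test statistic was used. As a rough illustration only, the sketch below assumes a Wald z-test on the difference between two independent accuracies. The function name and the counts in the usage example are hypothetical and are not taken from the study, whose paired reader design may well call for a different procedure.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_p_value(correct_new, n_new, correct_ref, n_ref, margin=0.05):
    """One-sided p-value for H0: acc_new - acc_ref <= -margin.

    Assumption: a Wald z-test for two independent proportions.
    This is an illustrative choice, not the method stated in the summary.
    """
    acc_new = correct_new / n_new
    acc_ref = correct_ref / n_ref
    # Unpooled standard error of the difference in accuracies.
    se = sqrt(acc_new * (1 - acc_new) / n_new + acc_ref * (1 - acc_ref) / n_ref)
    if se == 0:
        raise ValueError("standard error is zero; the z-test is undefined")
    # Shift the observed difference by the margin before standardizing.
    z = (acc_new - acc_ref + margin) / se
    # Upper-tail probability: a small p supports non-inferiority.
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = noninferiority_p_value(correct_new=900, n_new=1000,
                               correct_ref=900, n_ref=1000)
    print(f"p for non-inferiority: {p:.4f}")
```

With a margin of 0.05, non-inferiority is concluded when this p-value falls below the chosen significance level, meaning the new method's accuracy is unlikely to be more than 0.05 below the reference.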
ISSN: 2948-2933
DOI: 10.1007/s10278-024-01245-0