Automated Radiology-Arthroscopy Correlation of Knee Meniscal Tears Using Natural Language Processing Algorithms

Bibliographic details
Published in:	Academic radiology 2022-04, Vol.29 (4), p.479-487
Main authors:	Li, Matthew D.; Deng, Francis; Chang, Ken; Kalpathy-Cramer, Jayashree; Huang, Ambrose J.
Format:	Article
Language:	English
Subjects:	Arthroscopy; Humans; Knee MRI; Machine learning; Magnetic Resonance Imaging; Meniscal tear; Natural Language Processing; Radiology; Radiology-arthroscopy correlation; Retrospective Studies; Sensitivity and Specificity; Support Vector Machine; Tibial Meniscus Injuries - diagnostic imaging
Online access:	Full text

Description
Abstract:	To train and apply natural language processing (NLP) algorithms for automated radiology-arthroscopy correlation of meniscal tears. In this retrospective single-institution study, we trained supervised machine learning models (logistic regression, support vector machine, and random forest) to detect medial or lateral meniscus tears on free-text MRI reports. We trained the models and evaluated their performance with cross-validation using 3593 manually annotated knee MRI reports. To assess radiology-arthroscopy correlation, we then randomly partitioned this dataset 80:20 for training and testing, where 108 test set MRIs were followed by knee arthroscopy within 1 year. These free-text arthroscopy reports were also manually annotated. The NLP algorithms trained on the knee MRI training dataset were then evaluated on the MRI and arthroscopy report test datasets. We assessed radiology-arthroscopy agreement using the ensembled NLP-extracted findings versus manually annotated findings. The NLP models showed high cross-validation performance for meniscal tear detection on knee MRI reports (medial meniscus F1 scores 0.93–0.94, lateral meniscus F1 scores 0.86–0.88). When these algorithms were evaluated on arthroscopy reports, despite never having been trained on arthroscopy reports, performance was similar, though higher with model ensembling (medial meniscus F1 score 0.97, lateral meniscus F1 score 0.99). However, ensembling did not improve performance on knee MRI reports. In the radiology-arthroscopy test set, the ensembled NLP models detected mismatches between MRI and arthroscopy reports with 79% sensitivity and 87% specificity. Radiology-arthroscopy correlation can be automated for knee meniscal tears using NLP algorithms, which shows promise for education and quality improvement.
ISSN:	1076-6332, 1878-4046
DOI:	10.1016/j.acra.2021.01.017
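
Illustrative note: the abstract describes ensembling the outputs of several classifiers and then scoring how well NLP-detected MRI/arthroscopy mismatches agree with mismatches derived from manual annotation. The following is a minimal Python sketch of that evaluation logic, not the authors' implementation. The abstract does not say how the models were ensembled, so majority voting is an assumption here; the function names (`majority_vote`, `mismatches`, `sensitivity_specificity`) are hypothetical, and the labels are made-up toy values, not study data.

```python
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary predictions (1 = tear, 0 = no tear) from several
    models by strict majority vote. Assumed ensembling rule."""
    n_models = len(predictions)
    return [int(sum(votes) * 2 > n_models) for votes in zip(*predictions)]


def mismatches(mri: Sequence[int], arthroscopy: Sequence[int]) -> List[int]:
    """Flag each case where the MRI finding disagrees with arthroscopy."""
    return [int(m != a) for m, a in zip(mri, arthroscopy)]


def sensitivity_specificity(
    truth: Sequence[int], predicted: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity and specificity of predicted flags against reference flags."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy per-model predictions on six MRI reports (hypothetical values).
lr = [1, 0, 1, 1, 0, 0]
svm = [1, 0, 0, 1, 0, 1]
rf = [1, 1, 1, 1, 0, 0]
nlp_mri = majority_vote([lr, svm, rf])

# Toy ensembled NLP output on the matching arthroscopy reports.
nlp_arthroscopy = [1, 0, 0, 1, 1, 0]

# Toy manual annotations serving as the reference standard.
manual_mri = [1, 0, 1, 1, 0, 0]
manual_arthroscopy = [1, 0, 0, 1, 0, 0]

nlp_flags = mismatches(nlp_mri, nlp_arthroscopy)
manual_flags = mismatches(manual_mri, manual_arthroscopy)
sens, spec = sensitivity_specificity(manual_flags, nlp_flags)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this framing, a "positive" is a discordant MRI/arthroscopy pair, so sensitivity measures how many manually identified discordances the NLP pipeline also flags, and specificity measures how many concordant pairs it correctly leaves unflagged.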