Development of an initial training and evaluation programme for manual lower limb muscle MRI segmentation

Bibliographic details
Published in:	European Radiology Experimental 2024-07, Vol.8 (1), p.85-12, Article 85
Main authors:	Morrow, Jasper M., Shah, Sachit, Cristiano, Lara, Evans, Matthew R. B., Doherty, Carolynne M., Alnaemi, Talal, Saab, Abeer, Emira, Ahmed, Klickovic, Uros, Hammam, Ahmed, Altuwaijri, Afnan, Wastling, Stephen, Reilly, Mary M., Hanna, Michael G., Yousry, Tarek A., Thornton, John S.
Format:	Article
Language:	English
Subjects:	Automation; Benchmarking; Benchmarks; Biomarkers; Diagnostic Radiology; Imaging; Internal Medicine; Interventional Radiology; Machine learning; Magnetic resonance imaging; Medical research; Medicine; Medicine & Public Health; Medicine, Experimental; Muscle (skeletal); Neuromuscular diseases; Neuroradiology; Original Article; Radiology; Thigh; Ultrasound
Online access:	Full text

Description
Abstract:	Background: Magnetic resonance imaging (MRI) quantification of intramuscular fat accumulation is a responsive biomarker in neuromuscular diseases. Despite emergence of automated methods, manual muscle segmentation remains an essential foundation. We aimed to develop a training programme for new observers to demonstrate competence in lower limb muscle segmentation and establish reliability benchmarks for future human observers and machine learning segmentation packages. Methods: The learning phase of the training programme comprised a training manual, direct instruction, and eight lower limb MRI scans with reference standard large and small regions of interest (ROIs). The assessment phase used test–retest scans from two patients and two healthy controls. Interscan and interobserver reliability metrics were calculated to identify underperforming outliers and to determine competency benchmarks. Results: Three experienced observers undertook the assessment phase, whilst eight new observers completed the full training programme. Two of the new observers were identified as underperforming outliers, relating to variation in size or consistency of segmentations; six had interscan and interobserver reliability equivalent to those of experienced observers. The calculated benchmark for the Sørensen-Dice similarity coefficient between observers was greater than 0.87 and 0.92 for individual thigh and calf muscles, respectively. Interscan and interobserver reliability were significantly higher for large than small ROIs (all p
ISSN:	2509-9280
DOI:	10.1186/s41747-024-00475-9
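
The Sørensen-Dice similarity coefficient named in the abstract compares two segmentations of the same muscle as 2|A ∩ B| / (|A| + |B|), where A and B are the sets of voxels each observer marked. The record does not describe how the authors computed it, so the following is only a minimal sketch of the standard definition. The function name `dice_coefficient`, the representation of a segmentation as a set of voxel coordinates, and the toy data are all assumptions made for illustration.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Sørensen-Dice similarity of two segmentations.

    Each argument is an iterable of voxel coordinates (hypothetical
    representation, e.g. (row, col) tuples) marked by one observer.
    """
    a, b = set(voxels_a), set(voxels_b)
    total = len(a) + len(b)
    if total == 0:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2 * len(a & b) / total


# Toy example (made-up coordinates, not data from the study):
# two observers outline the same region with partial overlap.
observer_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
observer_2 = {(0, 1), (1, 1), (1, 2), (2, 1)}

score = dice_coefficient(observer_1, observer_2)
print(f"Dice coefficient: {score:.2f}")
```

A value of 1 means the two segmentations are identical and 0 means they share no voxels; the interobserver benchmarks quoted in the abstract (greater than 0.87 for thigh and 0.92 for calf muscles) are thresholds on this scale.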