Inter-rater Reliability and Cost in Pricing for Creating Dataset Focused on Mediolateral Oblique View in Mammography

Bibliographic details
Published in: Japanese Journal of Radiological Technology 2023/11/20, Vol.79(11), pp.1274-1279
Main authors: Yagahara, Ayako; Aoki, Yousuke; Kabeya, Mayu; Ogawa, Azusa; Tanaka, Yuki; Uesugi, Masahito
Format: Article
Language: English; Japanese
Description
Abstract: Purpose: This study assessed inter-rater reliability and workload for creating accurate training data for deep learning in the clinical evaluation of mammographic positioning. Methods: Two certified radiologic technologists labeled 107 lesion-free mammographic images on seven items: six clinical image evaluation criteria for positioning, plus breast tissue density. The kappa coefficient was calculated as an indicator of inter-rater reliability. The labeling cost per image was calculated from the labeling time and the technologists' salary. Results: The kappa coefficients were 0.71 for inframammary fold, 0.43 for nipple in profile, 0.45 for great pectoral muscle, 0.10 for symmetrical images, and 0.61 for retromammary fat. No significant difference was found in the coefficients of spread of breast tissue. The cost per image was calculated at 11.0 yen. Conclusion: The inter-rater reliability for the inframammary fold, nipple in profile, great pectoral muscle, and retromammary fat ranged from “moderate” to “substantial.” The reliability for symmetrical images was “slight,” indicating the need for a consensus among evaluators during labeling. The labeling cost was equivalent to or higher than that of existing services.
ISSN: 0369-4305, 1881-4883
DOI: 10.6009/jjrt.2023-1418
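
The abstract reports a kappa coefficient per evaluation item and a labeling cost derived from labeling time and salary. The following is a minimal sketch of how such figures could be computed, not the authors' procedure. It assumes unweighted Cohen's kappa for two raters (the abstract does not say which kappa variant was used) and assumes the "slight", "moderate", and "substantial" labels follow the commonly used Landis and Koch bands. The function names, the example labels, and the wage and time figures are all hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa undefined: both raters used one identical label")
    return (p_o - p_e) / (1.0 - p_e)


def agreement_band(kappa):
    """Map kappa to a Landis and Koch band (assumed scale, see lead-in)."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"


def cost_per_image(total_seconds, hourly_wage_yen, n_images):
    """Labor cost per image: labeling hours times hourly wage, per image."""
    return (total_seconds / 3600.0) * hourly_wage_yen / n_images


# Hypothetical labels (1 = criterion met, 0 = not met); not data from the study.
labels_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
labels_b = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]
kappa = cohen_kappa(labels_a, labels_b)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

Under these assumed bands, the reported values of 0.10, 0.43 to 0.45, and 0.61 to 0.71 fall into "slight", "moderate", and "substantial" respectively, which matches the wording of the conclusion.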