Use of semi-synthetic data for catheter segmentation improvement

Bibliographic details
Published in: Computerized Medical Imaging and Graphics, 2023-06, Vol. 106, Article 102188
Main authors: Danilov, Viacheslav V.; Kolpashchikov, Dmitrii Yu; Gerget, Olga M.; Laptev, Nikita V.; Proutski, Alex; Hernández Gómez, Luis A.; Alvarez, Federico; Ledesma-Carbayo, María J.
Format: Article
Language: English
Description
Summary: In the era of data-driven machine learning algorithms, data is the new oil. For the best results, datasets should be large, heterogeneous and, crucially, correctly labeled. However, data collection and labeling are time-consuming and labor-intensive processes. In the segmentation of medical devices during minimally invasive surgery, this leads to a lack of informative data. Motivated by this drawback, we developed an algorithm that generates semi-synthetic images based on real ones. The concept of this algorithm is to place a randomly shaped catheter in an empty heart cavity, where the shape of the catheter is generated by forward kinematics of continuum robots. Having implemented the proposed algorithm, we generated new images of heart cavities with various artificial catheters. We compared deep neural networks trained purely on real datasets with networks trained on both real and semi-synthetic datasets, and found that semi-synthetic data improves catheter segmentation accuracy. A modified U-Net trained on combined datasets performed the segmentation with a Dice similarity coefficient of 92.6 ± 2.2%, while the same model trained only on real images achieved a Dice similarity coefficient of 86.5 ± 3.6%. Therefore, using semi-synthetic data reduces the spread in accuracy, improves model generalization, reduces subjectivity, shortens the labeling routine, increases the number of samples, and improves heterogeneity.
Highlights:
• This study presents an algorithm to generate medical devices in echocardiography.
• The solution significantly reduces subjectivity and shortens the labeling routine.
• Combining real and semi-synthetic data improves the network's generalization ability.
• Neural networks trained on combined datasets show superior performance.
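For readers unfamiliar with the metric quoted in the summary, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that definition, not the authors' evaluation code. The function name `dice_coefficient`, the nested-list mask representation, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
from itertools import chain


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are assumed to be nested lists (rows of 0/1 values)
    with the same shape; this representation is illustrative only.
    """
    # Flatten both masks into lists of booleans.
    p = [bool(v) for v in chain.from_iterable(pred)]
    t = [bool(v) for v in chain.from_iterable(target)]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels labeled as catheter in both masks.
    intersection = sum(a and b for a, b in zip(p, t))
    # |A| + |B|: catheter pixels in each mask, summed.
    total = sum(p) + sum(t)

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 reference pixels, 2 shared.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)
print(f"Dice: {score:.3f}")
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no catheter pixels. The percentages in the summary correspond to this value multiplied by 100.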
ISSN: 0895-6111, 1879-0771
DOI: 10.1016/j.compmedimag.2023.102188