SinGAN-Seg: Synthetic training data generation for medical image segmentation
Saved in:
Published in: PLoS ONE, 2022-05, Vol. 17 (5), p. e0267976
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Analyzing medical data to find abnormalities is a time-consuming and costly task, particularly for rare abnormalities, requiring tremendous effort from medical experts. Therefore, artificial intelligence has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. However, the machine learning models used to build these tools are highly dependent on the data used to train them. Large amounts of data can be difficult to obtain in medicine due to privacy reasons, expensive and time-consuming annotations, and a general lack of data samples for infrequent lesions. In this study, we present a novel synthetic data generation pipeline, called SinGAN-Seg, to produce synthetic medical images with corresponding masks using a single training image. Our method differs from traditional generative adversarial networks (GANs) because it needs only a single image and the corresponding ground truth to train. We also show that the synthetic data generation pipeline can be used to produce alternative artificial segmentation datasets with corresponding ground truth masks when real datasets cannot be shared. The pipeline is evaluated using qualitative and quantitative comparisons between real and synthetic data to show that the style transfer technique used in our pipeline significantly improves the quality of the generated data, and that our method outperforms other state-of-the-art GANs at preparing synthetic images when the size of the training dataset is limited. By training UNet++ on both real data and synthetic data generated by the SinGAN-Seg pipeline, we show that models trained on synthetic data perform very close to those trained on real data when both datasets contain a considerable amount of training data. In contrast, we show that synthetic data generated by the SinGAN-Seg pipeline improves the performance of segmentation models when training datasets lack a considerable amount of data. All experiments were performed using an open dataset, and the code is publicly available on GitHub.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0267976
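
The abstract describes training a generator on a single image together with its ground-truth mask so that synthetic images and masks are produced jointly. The sketch below illustrates one way that idea can be realized: the image and mask are stacked into a four-channel tensor and a toy single-scale GAN is fit to that one sample. This is only a minimal, illustrative sketch, not the authors' SinGAN-Seg implementation (which is multi-scale, adds a style-transfer step, and is released on GitHub); the network sizes, hinge loss, and hyper-parameters here are assumptions chosen for brevity.

```python
# Illustrative sketch only: a toy single-scale generator/discriminator pair trained on
# one 4-channel sample (RGB image + segmentation mask stacked along the channel axis),
# in the spirit of the single-image GAN idea described in the abstract. All network
# sizes, loss choices, and names are assumptions, not the published SinGAN-Seg code.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out),
                         nn.LeakyReLU(0.2))

class TinyGenerator(nn.Module):
    """Maps a noise map to a 4-channel output: 3 image channels + 1 mask channel."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(conv_block(4, channels),
                                  conv_block(channels, channels),
                                  nn.Conv2d(channels, 4, 3, padding=1),
                                  nn.Tanh())
    def forward(self, z):
        return self.body(z)

class TinyDiscriminator(nn.Module):
    """Patch-style critic over the 4-channel (image + mask) input."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(conv_block(4, channels),
                                  conv_block(channels, channels),
                                  nn.Conv2d(channels, 1, 3, padding=1))
    def forward(self, x):
        return self.body(x)

# One training pair: an RGB image and its binary mask, both scaled to [-1, 1].
image = torch.rand(1, 3, 128, 128) * 2 - 1                  # placeholder for a real medical image
mask = (torch.rand(1, 1, 128, 128) > 0.5).float() * 2 - 1   # placeholder ground-truth mask
real_pair = torch.cat([image, mask], dim=1)                 # single 4-channel training sample

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=5e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=5e-4)

for step in range(200):
    z = torch.randn_like(real_pair)                         # fresh noise map each step

    # Discriminator update: real pair scored high, generated pair scored low (hinge loss).
    d_real = D(real_pair)
    d_fake = D(G(z).detach())
    loss_d = torch.relu(1 - d_real).mean() + torch.relu(1 + d_fake).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: fool the discriminator.
    loss_g = -D(G(z)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# After training, each noise sample yields a synthetic image with a matching mask.
fake = G(torch.randn_like(real_pair))
fake_image, fake_mask = fake[:, :3], fake[:, 3:]
```

In a full pipeline, image-mask pairs drawn from such a generator could be added to the training set of a segmentation model such as UNet++ when real annotated data are scarce, which is the use case the abstract reports improvements for.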