Healthcare Cramér Generative Adversarial Network (HCGAN)


Bibliographic details
Published in: Distributed and Parallel Databases: An International Journal, 2022-12, Vol. 40 (4), p. 657-673
Main authors: Indhumathi, R., Devi, S. Sathiya
Format: Article
Language: English
Description
Summary: Medical data is widely shared for research purposes, and the data privacy community has produced an extensive body of work on anonymization. Unfortunately, data anonymization techniques do not provide privacy guarantees, and synthetic data generation is an alternative approach. Deep learning has recently gained a reputation for its high accuracy and its usefulness for privacy protection. It is now extensively applied in the medical field for classification, segmentation and privacy preservation. Using deep learning, synthetic data can be generated to improve the privacy of the original medical data and to prevent attacks. Deep learning models capture the relationships between multiple features in medical data. In this research, the Healthcare Cramér Generative Adversarial Network (HCGAN) is proposed, in which (i) the Quasi Identifiers (QI) in the medical data are identified and separated as QI attributes, and the remaining attributes are treated as Sensitive Attributes (SA); (ii) the f-differential privacy anonymization technique is applied only to the identified QI, and the result is combined with the SA; (iii) the anonymized medical data is used as real data for training a Cramér Generative Adversarial Network (GAN), where the Cramér distance is used to improve the efficiency of the model; and (iv) finally, privacy is checked against attacks. The results show that the HCGAN method prevents attacks during the training and testing phases more effectively than Wasserstein GAN, and that it generates synthetic data that provides high privacy and withstands various attacks.
ISSN: 0926-8782, 1573-7578
DOI: 10.1007/s10619-021-07346-x
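
Illustrative sketch (not taken from the article): steps (i) to (iii) of the summary can be made concrete with a small NumPy example. It rests on several assumptions. The column names (age, zip3, glucose) are hypothetical. The Laplace mechanism is swapped in as a simple stand-in for the f-differential privacy technique named in the summary, with placeholder sensitivity and epsilon values. The energy distance is used as the multivariate form of the Cramér distance that Cramér GANs are commonly built on. No GAN is trained here; the "synthetic" sample is a placeholder for generator output.

```python
import numpy as np


def split_qi_sa(records, qi_columns):
    """Step (i): split a dict of column arrays into QI and SA parts."""
    qi = {k: v for k, v in records.items() if k in qi_columns}
    sa = {k: v for k, v in records.items() if k not in qi_columns}
    return qi, sa


def laplace_perturb(values, sensitivity, epsilon, rng):
    """Step (ii), ASSUMED stand-in: Laplace noise with scale sensitivity/epsilon.

    The article names f-differential privacy; this is a simpler mechanism
    used only for illustration.
    """
    values = np.asarray(values, dtype=float)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=values.shape)
    return values + noise


def energy_distance(x, y):
    """Sample energy distance: 2 E|X-Y| - E|X-X'| - E|Y-Y'| over all pairs."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def mean_pairwise(a, b):
        # Euclidean distance between every row of a and every row of b.
        diff = a[:, None, :] - b[None, :, :]
        return np.linalg.norm(diff, axis=-1).mean()

    return 2.0 * mean_pairwise(x, y) - mean_pairwise(x, x) - mean_pairwise(y, y)


rng = np.random.default_rng(0)

# Hypothetical toy records; real medical data would be loaded instead.
records = {
    "age": rng.integers(20, 90, size=200),
    "zip3": rng.integers(100, 999, size=200),
    "glucose": rng.normal(100.0, 15.0, size=200),
}

qi, sa = split_qi_sa(records, qi_columns={"age", "zip3"})

# Perturb only the quasi-identifiers, then recombine with the sensitive column.
anon_qi = {k: laplace_perturb(v, sensitivity=1.0, epsilon=0.5, rng=rng)
           for k, v in qi.items()}
real = np.column_stack([anon_qi["age"], anon_qi["zip3"], sa["glucose"]])

# Step (iii): the anonymized table plays the role of "real" data for the GAN.
# Placeholder for generator output: the real table plus Gaussian noise.
synthetic = real + rng.normal(0.0, 1.0, size=real.shape)

print("energy distance (real vs. synthetic):", energy_distance(real, synthetic))
```

In a Cramér GAN a distance of this kind is computed on critic outputs and minimized by the generator; here it only measures how far the placeholder sample is from the anonymized table. The pairwise computation uses memory quadratic in the sample size, so larger tables would need batching.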