Constraints-Relaxed Functional Dependency based Data Privacy Preservation Model

Bibliographic details
Published in:	Engineering letters 2023-02, Vol.31 (1), p.19
Main authors:	Basapur, Satish B; Shylaja, B S
Format:	Article
Language:	eng
Subjects:	Algorithms; Confidentiality; Correlation; Data analysis; Datasets; Encryption; Heart diseases; Income; Privacy; Public policy; User requirements

Description
Summary:	A data privacy preservation technique must ensure data integrity and protect confidential data from unsolicited or unapproved use by any user, authorized or unauthorized, while genuine users remain able to use the data for legal purposes. Confidential data should be excluded from data analysis. Further, sensitive data resulting from data analysis should not be published if it breaches an individual's data privacy. Numerous methods have been proposed in the literature to preserve data privacy, including the k-anonymity, l-diversity, and t-closeness privacy models, encryption-based methods, and associative rule-based methods. However, these methods cause considerable data distortion and reduce data utility. The proposed approach scales down high-dimensional data by finding correlations between attributes in the dataset through Constraints-Relaxed Functional Dependencies (CFDs). If correlated attributes violate privacy according to user requirements or government policies, it finds a minimal set of correlated attributes to be obscured using the heuristic Minimal Vertex Forward Set (MVFS) and encrypts those attributes using a block cipher. The proposed method minimizes the number of attributes to be obscured and enhances data usage while preserving information confidentiality. All experiments are carried out using Apache Spark in a cloud environment with two datasets: Heart-Disease and Income-Census (KDD) [39]. The experimental results show the number of attributes to be obscured under different CFD configuration settings for the Heart-Disease and Income-Census datasets. The outcome of the experiments illustrates the correlation between attributes in each dataset, and the results establish a relation between the number of attributes to be obscured and the level of information confidentiality.
ISSN:	1816-093X 1816-0948
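
The first two steps named in the summary (detecting relaxed dependencies between attributes, then choosing a small set of attributes to obscure) can be illustrated with a minimal Python sketch. It is not the authors' method. The record does not define how a dependency is relaxed, so the sketch assumes a dependency X -> Y holds when the share of rows consistent with it reaches a threshold. The record also does not specify MVFS, so a plain greedy vertex cover stands in for it. The function names, the sample rows, and the 0.8 threshold are all hypothetical, and the block cipher step is left out.

```python
from collections import Counter, defaultdict
from itertools import permutations


def fd_confidence(rows, lhs, rhs):
    """Share of rows that survive if, for each lhs value, only the most
    frequent rhs value is kept. 1.0 means lhs -> rhs holds exactly.
    (Assumed relaxation measure, not taken from the paper.)"""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    kept = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return kept / len(rows)


def relaxed_fds(rows, attributes, threshold):
    """All single-attribute pairs (lhs, rhs) whose confidence meets the threshold."""
    return [(lhs, rhs) for lhs, rhs in permutations(attributes, 2)
            if fd_confidence(rows, lhs, rhs) >= threshold]


def greedy_cover(edges):
    """Greedy vertex cover: repeatedly obscure the attribute that touches the
    most remaining privacy-violating dependencies. A stand-in for MVFS;
    the result is small but not guaranteed to be minimum."""
    remaining = set(edges)
    chosen = []
    while remaining:
        degree = Counter()
        for lhs, rhs in remaining:
            degree[lhs] += 1
            degree[rhs] += 1
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(degree), key=degree.__getitem__)
        chosen.append(best)
        remaining = {edge for edge in remaining if best not in edge}
    return chosen


# Hypothetical toy data: zip determines city exactly, city determines zip
# for five rows out of six.
rows = [
    {"zip": "A", "city": "X", "disease": "yes"},
    {"zip": "A", "city": "X", "disease": "no"},
    {"zip": "B", "city": "X", "disease": "yes"},
    {"zip": "C", "city": "Y", "disease": "no"},
    {"zip": "C", "city": "Y", "disease": "no"},
    {"zip": "C", "city": "Y", "disease": "yes"},
]
dependencies = relaxed_fds(rows, ["zip", "city", "disease"], threshold=0.8)
to_obscure = greedy_cover(dependencies)
```

Lowering the threshold admits weaker dependencies, which generally enlarges the set of attributes to obscure. This mirrors the trade-off the summary reports between the number of obscured attributes and the level of confidentiality.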