Constraints-Relaxed Functional Dependency based Data Privacy Preservation Model
Published in: | Engineering letters 2023-02, Vol.31 (1), p.19 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | A data privacy preservation technique must ensure data integrity and protect confidential data from unsolicited or unapproved use by any user, authorized or unauthorized, while still allowing genuine users to use the data for legal purposes. Confidential data should be excluded from data analysis. Further, sensitive data resulting from data analysis should not be published if it breaches an individual's data privacy. Numerous methods have been proposed in the literature to preserve data privacy, including the k-anonymity, l-diversity, and t-closeness privacy models, encryption-based methods, and association rule-based methods. However, these methods introduce more data distortion and yield less data utility. The proposed approach scales down high-dimensional data by finding attribute correlations in the dataset through Constraints-Relaxed Functional Dependencies (CFDs). If correlated attributes violate privacy according to user requirements or government policies, it finds a minimal set of correlated attributes to be obscured using the heuristic Minimal Vertex Forward Set (MVFS) and encrypts those attributes using a block cipher. The proposed method minimizes the number of attributes to be obscured and enhances data usage while preserving information confidentiality. All experiments are carried out using Apache Spark in a cloud environment with two datasets: Heart-Disease and Income-Census (KDD) [39]. The experimental results show the number of attributes to be obscured under different CFD configuration settings for the Heart-Disease and Income-Census datasets. The outcome of the experiment illustrates a correlation between attributes in the dataset. The results establish a relation between the number of attributes to be obscured and the level of information confidentiality. |
---|---|
ISSN: | 1816-093X 1816-0948 |