iCAT+: An Interactive Customizable Anonymization Tool Using Automated Translation Through Deep Learning

Bibliographic details
Published in: IEEE Transactions on Dependable and Secure Computing, 2024-07, Vol. 21 (4), p. 2799-2817
Main authors: Oqaily, Momen; Kabir, Mohammad Ekramul; Majumdar, Suryadipta; Jarraya, Yosr; Zhang, Mengyuan; Pourzandi, Makan; Wang, Lingyu; Debbabi, Mourad
Format: Article
Language: English
Description
Summary: Data anonymization is a viable solution for data owners to mitigate their privacy concerns. However, existing data anonymization tools are too inflexible to support the various privacy and utility requirements of both data owners and data users. In most cases, this limitation is due to a lack of understanding of those requirements as well as the non-customizability of the existing tools. To address this limitation, we propose iCAT+, which is an interactive and customizable anonymization approach. More specifically, we first automate the interpretation of data owners' and data users' textual requirements by deploying a Convolutional Neural Network (CNN) model for Natural Language Processing (NLP). Second, we introduce the concept of the anonymization space to model possible combinations of per-attribute anonymization primitives based on the level of privacy and utility that each primitive provides. Third, we design an ontology model that maps the translated requirements into their appropriate anonymization primitives in the defined anonymization space corresponding to the plain data. Fourth, we evaluate the efficiency and effectiveness of iCAT+ based on both real and synthetic network data. Finally, we assess its usability through a real user study involving participants from industry and research laboratories. Our experiments show the effectiveness and efficiency of our solution (e.g., requirement translation accuracy of 99% at the data owner side and 98% at the data user side, with a computational time of around one minute for the Google cluster dataset).
ISSN: 1545-5971, 1941-0018
DOI: 10.1109/TDSC.2023.3317806
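
As a rough illustration of the anonymization space mentioned in the abstract, the following Python sketch enumerates combinations of per-attribute primitives and keeps those that meet minimum privacy and utility levels. The primitive names, the integer privacy and utility scores, and the functions `anonymization_space` and `feasible` are assumptions made for illustration only; they are not taken from the paper and do not represent the authors' implementation.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Primitive:
    """One anonymization primitive with hypothetical privacy/utility scores."""
    name: str
    privacy: int  # higher means stronger privacy (assumed scale)
    utility: int  # higher means more retained utility (assumed scale)


# Hypothetical primitives per attribute; names and scores are illustrative.
PRIMITIVES = {
    "ip": [
        Primitive("none", 0, 3),
        Primitive("prefix_preserving", 1, 2),
        Primitive("truncation", 2, 1),
        Primitive("suppression", 3, 0),
    ],
    "timestamp": [
        Primitive("none", 0, 2),
        Primitive("rounding", 1, 1),
        Primitive("suppression", 2, 0),
    ],
}


def anonymization_space(primitives):
    """Yield every combination that assigns one primitive to each attribute."""
    attrs = sorted(primitives)
    for combo in product(*(primitives[a] for a in attrs)):
        yield dict(zip(attrs, combo))


def feasible(space, min_privacy, min_utility):
    """Keep combinations meeting the per-attribute privacy and utility minimums."""
    return [
        combo
        for combo in space
        if all(
            p.privacy >= min_privacy.get(attr, 0)
            and p.utility >= min_utility.get(attr, 0)
            for attr, p in combo.items()
        )
    ]


if __name__ == "__main__":
    space = list(anonymization_space(PRIMITIVES))
    # Data owner asks for at least level-1 privacy; data user for level-1 utility.
    chosen = feasible(space, {"ip": 1, "timestamp": 1}, {"ip": 1, "timestamp": 1})
    for combo in chosen:
        print({attr: p.name for attr, p in combo.items()})
```

In this toy setting, the data owner's requirement acts as a lower bound on privacy and the data user's requirement as a lower bound on utility, so the feasible set is the overlap of the two. How iCAT+ actually scores primitives and resolves conflicts between the two sides is described in the article itself.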