Exploring privacy-preserving models in model space
Saved in:
Author:
Format: Dissertation
Language: English
Subjects:
Online access: Order full text
Abstract: Privacy-preserving techniques have become increasingly essential in the rapidly advancing era of artificial intelligence (AI), particularly in areas such as deep learning (DL). A key architecture in DL is the Multilayer Perceptron (MLP), a type of feedforward neural network. MLPs consist of at least three layers of nodes: an input layer, one or more hidden layers, and an output layer. Each node, except for the input nodes, is a neuron with a nonlinear activation function. MLPs are capable of learning complex models due to their deep structure and nonlinear processing layers. However, the extensive data requirements of MLPs, which often include sensitive information, make privacy a crucial concern. Several types of privacy attacks are specifically designed to target DL models such as MLPs, potentially leading to information leakage. Therefore, implementing privacy-preserving approaches is crucial to prevent such leaks. Most privacy-preserving methods focus on protecting privacy either at the database level or during inference (output) from the model; both approaches have practical limitations. In this thesis, we explore a novel privacy-preserving approach for DL models that focuses on choosing anonymous models, i.e., models that can be generated by a set of different datasets. This privacy approach is called Integral Privacy (IP). IP provides a sound defense against Membership Inference Attacks (MIA), which aim to determine whether a sample was part of the training set.
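To make the architecture described above concrete, the following is a minimal, illustrative NumPy sketch of an MLP forward pass; the layer sizes, random weights, and choice of ReLU activation are assumptions for illustration and are not taken from the thesis.

```python
# Minimal sketch of an MLP forward pass (illustrative only; layer sizes,
# activations, and weights are assumptions, not the thesis' models).
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Nonlinear activation applied at every non-input node.
    return np.maximum(0.0, x)

def mlp_forward(x, params):
    """Propagate an input vector through hidden layers to the output layer."""
    h = x
    for W, b in params[:-1]:
        h = relu(W @ h + b)      # hidden layers: affine map + nonlinearity
    W_out, b_out = params[-1]
    return W_out @ h + b_out     # output layer (e.g. logits)

# Example: 4 inputs -> 8 hidden units -> 3 outputs.
params = [
    (rng.normal(size=(8, 4)), np.zeros(8)),
    (rng.normal(size=(3, 8)), np.zeros(3)),
]
print(mlp_forward(rng.normal(size=4), params))
```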
Considering the vast number of parameters in DL models, searching the model space for recurring models can be computationally intensive and time-consuming. To address this challenge, we present a relaxed variation of IP called $\Delta$-Integral Privacy ($\Delta$-IP), where two models are considered equivalent if their difference is within a threshold $\Delta$. We also highlight the challenge of comparing two deep neural networks (DNNs), particularly when similar layers in different networks contain neurons that are permutations or combinations of one another. This adds complexity to the concept of IP, as identifying equivalences between such models is not straightforward. In addition, we present a methodology, along with its theoretical analysis, for generating a set of integrally private DL models.
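As a rough illustration of the $\Delta$-equivalence idea (not the thesis' exact procedure), the sketch below compares two single-hidden-layer MLPs by the distance between their parameters, minimised over permutations of the hidden units to account for the neuron-reordering issue mentioned above; the model representation, the L2 distance, and the brute-force alignment are all simplifying assumptions.

```python
# Sketch of a Δ-equivalence check, under the simplifying assumption that
# models are single-hidden-layer MLPs stored as (W1, b1, W2) tuples.
# The permutation alignment illustrates why comparing networks is
# non-trivial; it is not the thesis' exact procedure.
from itertools import permutations
import numpy as np

def distance(model_a, model_b):
    """L2 distance between the flattened parameter vectors of two models."""
    flat_a = np.concatenate([p.ravel() for p in model_a])
    flat_b = np.concatenate([p.ravel() for p in model_b])
    return float(np.linalg.norm(flat_a - flat_b))

def aligned_distance(model_a, model_b):
    """Minimum distance over permutations of model_b's hidden units
    (feasible only for small hidden layers; shown for illustration)."""
    W1b, b1b, W2b = model_b
    n_hidden = W1b.shape[0]
    best = np.inf
    for perm in permutations(range(n_hidden)):
        p = list(perm)
        permuted = (W1b[p, :], b1b[p], W2b[:, p])
        best = min(best, distance(model_a, permuted))
    return best

def delta_equivalent(model_a, model_b, delta):
    """Two models count as the 'same' recurring model if their aligned
    parameter difference falls within the Δ threshold."""
    return aligned_distance(model_a, model_b) <= delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m1 = (rng.normal(size=(4, 3)), rng.normal(size=4), rng.normal(size=(2, 4)))
    # m2: the same network with its hidden units listed in a different order.
    order = [2, 0, 3, 1]
    m2 = (m1[0][order, :], m1[1][order], m1[2][:, order])
    print(delta_equivalent(m1, m2, delta=1e-6))  # True: models coincide after alignment
```

Under this reading, a trained model would be reported as integrally private when $\Delta$-equivalent models recur across sufficiently many distinct training datasets.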
In practice, data often arrives rapidly and in large volumes, and its statistical properties can change over time. Detecting and adapting to such drifts is crucial for maintaining a model's reliable predictive performance.
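As a generic illustration of detecting such a drift in a data stream (not the method developed in the thesis), the sketch below compares a reference window with a recent window of observations; the window sizes, threshold, and synthetic data are assumptions.

```python
# Minimal sketch of drift detection by comparing a reference window with a
# recent window of the stream (a generic illustration, not the thesis' method).
from collections import deque
import numpy as np

def drift_detected(reference, recent, threshold=0.5):
    """Flag drift when the recent window's mean moves away from the
    reference window's mean by more than `threshold` (an assumed cut-off)."""
    return abs(np.mean(recent) - np.mean(reference)) > threshold

rng = np.random.default_rng(0)
# Synthetic stream whose mean shifts from 0 to 2 halfway through.
stream = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])

reference = deque(maxlen=200)
recent = deque(maxlen=200)

for t, x in enumerate(stream):
    (reference if t < 200 else recent).append(x)
    if len(recent) == recent.maxlen and drift_detected(reference, recent):
        print(f"possible drift around sample {t}")
        break
```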