Protecting Decision Boundary of Machine Learning Model With Differentially Private Perturbation

Bibliographic details
Published in:	IEEE transactions on dependable and secure computing 2022-05, Vol.19 (3), p.2007-2022
Main authors:	Zheng, Huadi; Ye, Qingqing; Hu, Haibo; Fang, Chengfang; Shi, Jie
Format:	Article
Language:	English
Subjects:	Adaptation models; Adaptive algorithms; adversarial machine learning; Algorithms; boundary differential privacy; Computational modeling; Differential privacy; Machine learning; Model defense; model extraction; Perturbation; Perturbation methods; Prediction algorithms; Predictive models; Privacy; Queries
Online access:	Full text

Description
Summary:	Machine learning service APIs allow model owners to monetize proprietary models by offering prediction services to third-party users. However, existing literature shows that model parameters are vulnerable to extraction attacks, which accumulate prediction queries and their responses to train a replica model. As countermeasures, researchers have proposed reducing the rich API output, for example by hiding the precise confidence. Nonetheless, even when the response is only one bit, an adversary can still exploit fine-tuned queries with differential property to infer the decision boundary of the underlying model. In this article, we propose boundary differential privacy (BDP) against such attacks by obfuscating the prediction responses with noise. BDP guarantees that an adversary cannot learn the decision boundary of any two classes by a predefined precision, no matter how many queries are issued to the prediction API. We first design a perturbation algorithm called boundary randomized response for a binary model. Then we prove it satisfies ε-BDP, followed by a generalization of this algorithm to a multiclass model. Finally, we generalize a hard boundary to a soft boundary and design an adaptive perturbation algorithm that can still work in the latter case. The effectiveness and high utility of our solution are verified by extensive experiments on both linear and non-linear models.
ISSN:	1545-5971, 1941-0018
DOI:	10.1109/TDSC.2020.3043382
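
The summary describes perturbing one-bit prediction responses near the decision boundary. As a rough illustration only, the following sketch applies classic randomized response to a binary label when a query falls inside a boundary zone. It is not the article's boundary randomized response algorithm and is not claimed to satisfy ε-BDP. The names `score` (a signed distance to the boundary), `delta` (the zone half-width), and the keep probability e^ε / (1 + e^ε) are assumptions chosen for the example, not details taken from the record.

```python
import math
import random


def keep_probability(epsilon):
    # Classic randomized-response keep probability for one bit (assumption:
    # the article may use a different probability for its own guarantee).
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def perturbed_label(score, epsilon, delta, rng):
    """Return a possibly flipped binary label.

    score   -- hypothetical signed distance of the query to the boundary
    epsilon -- privacy parameter; larger values flip less often
    delta   -- hypothetical half-width of the zone in which labels are perturbed
    rng     -- a random.Random instance, passed in so runs are reproducible
    """
    true_label = 1 if score >= 0 else 0
    if abs(score) > delta:
        # Far from the boundary: answer truthfully.
        return true_label
    # Inside the zone: keep the true label with the keep probability, else flip.
    if rng.random() < keep_probability(epsilon):
        return true_label
    return 1 - true_label


if __name__ == "__main__":
    rng = random.Random(0)
    print(perturbed_label(2.0, 1.0, 0.5, rng))   # outside the zone: always 1
    print(perturbed_label(0.1, 1.0, 0.5, rng))   # inside the zone: 1 or 0
```

Note that this sketch draws fresh randomness on every call, so repeating the same query and averaging the answers would reveal the true label. The guarantee described in the summary holds "no matter how many queries are issued", which this simplified version does not provide.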