Privacy-Preserving Non-Negative Matrix Factorization with Outliers

Bibliographic details
Published in: ACM Transactions on Knowledge Discovery from Data 2024-01, Vol.18 (3), p.1-26, Article 64
Main authors: Saha, Swapnil; Imtiaz, Hafiz
Format: Article
Language: English
Description
Abstract: Non-negative matrix factorization is a popular unsupervised machine learning algorithm for extracting meaningful features from inherently non-negative data. Such data often contain privacy-sensitive user information. Additionally, the dataset can contain outliers, which may lead to extracting sub-optimal features from the data. It is, therefore, necessary to address these two issues while analyzing privacy-sensitive data that may contain outliers. In this work, we develop a non-negative matrix factorization algorithm in the privacy-preserving framework that (i) considers the presence of outliers in the data, and (ii) can achieve results comparable to those of the non-private algorithm. We design our method so that one can select the degree of privacy guarantee based on the required utility gap. We demonstrate the effectiveness of our proposed algorithm on six diverse real datasets. The experimental results show that, under some parameter choices, our proposed method closely approximates the performance of the non-private algorithm while ensuring strict privacy guarantees.
ISSN: 1556-4681, 1556-472X
DOI:10.1145/3632961