Deep Unlearn: Benchmarking Machine Unlearning

Machine unlearning (MU) aims to remove the influence of particular data points from the learnable parameters of a trained machine learning model. This is a crucial capability in light of data privacy requirements, trustworthiness, and safety in deployed models. MU is particularly challenging for deep neural networks (DNNs), such as convolutional nets or vision transformers, as such DNNs tend to memorize a notable portion of their training dataset. Nevertheless, the community lacks a rigorous and multifaceted study that looks into the success of MU methods for DNNs. In this paper, we investigate 18 state-of-the-art MU methods across various benchmark datasets and models, with each evaluation conducted over 10 different initializations: a comprehensive evaluation involving MU over 100K models. We show that, with the proper hyperparameters, Masked Small Gradients (MSG) and Convolution Transpose (CT) consistently perform better in terms of model accuracy and run-time efficiency across different models, datasets, and initializations, as assessed by population-based membership inference attacks (MIA) and per-sample unlearning likelihood ratio attacks (U-LiRA). Furthermore, our benchmark highlights that comparing an MU method only against commonly used baselines, such as Gradient Ascent (GA) or Successive Random Relabeling (SRL), is inadequate; better baselines like Negative Gradient Plus (NG+) with proper hyperparameter selection are needed.
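As an illustration of two ideas named in the abstract, the sketch below shows a Gradient Ascent (GA) style unlearning baseline (a few loss-maximizing steps on the forget set) and a simple population-based membership inference check that compares per-sample losses on forgotten points against held-out test points. This is a minimal PyTorch sketch, not code from the paper: the function names, hyperparameters, and the loss-threshold attack variant are illustrative assumptions, and the paper's MSG, CT, NG+, and U-LiRA procedures involve method-specific details not reproduced here.

import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score


def unlearn_gradient_ascent(model, forget_loader, epochs=1, lr=1e-4):
    # GA baseline: take gradient *ascent* steps on the forget set by
    # minimizing the negated cross-entropy loss for a few epochs.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in forget_loader:
            opt.zero_grad()
            (-F.cross_entropy(model(x), y)).backward()
            opt.step()
    return model


@torch.no_grad()
def per_sample_losses(model, loader):
    # Collect one cross-entropy loss per example (no gradients needed).
    model.eval()
    return torch.cat([F.cross_entropy(model(x), y, reduction="none")
                      for x, y in loader])


def population_mia_auc(model, forget_loader, test_loader):
    # Population-based MIA via loss thresholding: lower loss is treated
    # as evidence of membership. An AUC near 0.5 means forgotten points
    # are indistinguishable from unseen test points, i.e. the unlearning
    # left little trace for this attack to exploit.
    forget = per_sample_losses(model, forget_loader)
    test = per_sample_losses(model, test_loader)
    scores = torch.cat([-forget, -test])
    labels = torch.cat([torch.ones_like(forget), torch.zeros_like(test)])
    return roc_auc_score(labels.cpu().numpy(), scores.cpu().numpy())

A stronger baseline in the NG+ family would, as commonly described in the unlearning literature, also take descent steps on retained data in the same loop; this sketch omits that step.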

Bibliographic Details

Published in: arXiv.org (2024-10)
Main authors: Cadet, Xavier F; Borovykh, Anastasia; Malekzadeh, Mohammad; Ahmadi-Abhari, Sara; Haddadi, Hamed
Publisher: Cornell University Library, arXiv.org (Ithaca)
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Artificial neural networks; Benchmarks; Data points; Datasets; Likelihood ratio; Machine learning; State-of-the-art reviews
Online access: Full text
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T03%3A01%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Deep%20Unlearn:%20Benchmarking%20Machine%20Unlearning&rft.jtitle=arXiv.org&rft.au=Cadet,%20Xavier%20F&rft.date=2024-10-02&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3112657012%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3112657012&rft_id=info:pmid/&rfr_iscdi=true