Desensitized RDCA Subspaces for Compressive Privacy in Machine Learning
The quest for better data analysis and artificial intelligence has led to more and more data being collected and stored. As a consequence, more data are exposed to malicious entities. This paper examines the problem of privacy in machine learning for classification. We utilize the Ridge Discriminant Component Analysis (RDCA) to desensitize data with respect to a privacy label. Based on five experiments, we show that desensitization by RDCA can effectively protect privacy (i.e. low accuracy on the privacy label) with small loss in utility. On the HAR and CMU Faces datasets, the use of desensitized data yields random-guess-level accuracies for privacy at a cost of average drops of 5.14% and 0.04%, respectively, in the utility accuracies. For the Semeion Handwritten Digit dataset, accuracies on the privacy-sensitive digits are almost zero, while the accuracies for the utility-relevant digits drop by 7.53% on average. This presents a promising solution to the problem of privacy in machine learning for classification.
Saved in:
Published in: | arXiv.org, 2017-07 |
---|---|
Main authors: | Filipowicz, Artur; Chanyaswad, Thee; Kung, S Y |
Format: | Article |
Language: | English |
Subjects: | Artificial intelligence; Classification; Data analysis; Desensitization; Digits; Handwriting; Machine learning; Privacy; Subspaces |
Online access: | Full text |
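The method described in the abstract can be pictured concretely: RDCA finds the directions along which the privacy label is most linearly separable, and desensitization discards those directions before any downstream learning. The sketch below is an illustrative reconstruction under that reading, not the authors' implementation; the function names, the ridge parameter `rho`, and the choice of how many privacy directions `k` to drop are all assumptions for demonstration.

```python
import numpy as np

def rdca_directions(X, y, rho=1e-3):
    """Ridge Discriminant Component Analysis, minimal sketch.

    Solves the generalized eigenproblem  S_B w = lam (S_W + rho I) w,
    where S_B / S_W are the between- / within-class scatter matrices
    of label y; columns are sorted from most to least discriminative.
    """
    d = X.shape[1]
    mu = X.mean(axis=0)
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)   # between-class scatter
        Zc = Xc - mc
        S_W += Zc.T @ Zc                   # within-class scatter
    # The ridge term rho*I keeps the within-class scatter invertible.
    M = np.linalg.solve(S_W + rho * np.eye(d), S_B)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)
    return eigvecs.real[:, order]

def desensitize(X, y_privacy, k, rho=1e-3):
    """Drop the k directions most discriminative of the privacy label
    and return the data projected onto the remaining subspace."""
    W = rdca_directions(X, y_privacy, rho)
    W_keep = W[:, k:]  # keep only the privacy-insensitive directions
    return X @ W_keep, W_keep
```

As a usage note (again an assumption, not taken from the paper): for a C-class privacy label at most C-1 discriminant directions exist, so calling `desensitize(X, y_privacy, k=C-1)` removes all of them, which is one plausible way to obtain the "random guess level" privacy accuracies the abstract reports.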
creator | Filipowicz, Artur; Chanyaswad, Thee; Kung, S Y |
format | Article |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2017-07 |
issn | 2331-8422 |
language | eng |
source | Free E-Journals |
subjects | Artificial intelligence; Classification; Data analysis; Desensitization; Digits; Handwriting; Machine learning; Privacy; Subspaces |
title | Desensitized RDCA Subspaces for Compressive Privacy in Machine Learning |