PCOR: Private Contextual Outlier Release via Differentially Private Search

Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, that a...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2021-03
Hauptverfasser: Shafieinejad, Masoumeh, Kerschbaum, Florian, Ilyas, Ihab F
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Shafieinejad, Masoumeh
Kerschbaum, Florian
Ilyas, Ihab F
description Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, that are not outliers in the whole dataset but in some specific neighborhoods. Contextual outliers are particularly important in data exploration and targeted anomaly explanation and diagnosis. In these scenarios, the data owner computes the following information: i) The attributes that contribute to the abnormality of an outlier (metric), ii) Contextual description of the outlier's neighborhoods (context), and iii) The utility score of the context, e.g. its strength in showing the outlier's significance, or in relation to a particular explanation for the outlier. However, revealing the outlier's context leaks information about the other individuals in the population as well, violating their privacy. We address the issue of population privacy violations in this paper, and propose a solution for the two main challenges. In this setting, the data owner is required to release a valid context for the queried record, i.e. a context in which the record is an outlier. Hence, the first major challenge is that the privacy technique must preserve the validity of the context for each record. We propose techniques to protect the privacy of individuals through a relaxed notion of differential privacy to satisfy this requirement. The second major challenge is applying the proposed techniques efficiently, as they impose intensive computation to the base algorithm. To overcome this challenge, we propose a graph structure to map the contexts to, and introduce differentially private graph search algorithms as efficient solutions for the computation problem caused by differential privacy techniques.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2499695948</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2499695948</sourcerecordid><originalsourceid>FETCH-proquest_journals_24996959483</originalsourceid><addsrcrecordid>eNqNi70KwjAYAIMgWLTvEHAuxKStjWtUxKWldS8f8hVTQqv5Kfr2Ooiz0w13NyMRF2KTFCnnCxI71zPGeL7lWSYicq5UWe9oZfUEHqkaB49PH8DQMnij0dIaDYJDOmmge911aHHwGox5_a4GwV5vKzLvwDiMv1yS9fFwUafkbsdHQOfbfgx2-KiWp1LmMpNpIf6r3oXWPIk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2499695948</pqid></control><display><type>article</type><title>PCOR: Private Contextual Outlier Release via Differentially Private Search</title><source>Free E- Journals</source><creator>Shafieinejad, Masoumeh ; Kerschbaum, Florian ; Ilyas, Ihab F</creator><creatorcontrib>Shafieinejad, Masoumeh ; Kerschbaum, Florian ; Ilyas, Ihab F</creatorcontrib><description>Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, that are not outliers in the whole dataset but in some specific neighborhoods. Contextual outliers are particularly important in data exploration and targeted anomaly explanation and diagnosis. In these scenarios, the data owner computes the following information: i) The attributes that contribute to the abnormality of an outlier (metric), ii) Contextual description of the outlier's neighborhoods (context), and iii) The utility score of the context, e.g. its strength in showing the outlier's significance, or in relation to a particular explanation for the outlier. However, revealing the outlier's context leaks information about the other individuals in the population as well, violating their privacy. We address the issue of population privacy violations in this paper, and propose a solution for the two main challenges. In this setting, the data owner is required to release a valid context for the queried record, i.e. a context in which the record is an outlier. Hence, the first major challenge is that the privacy technique must preserve the validity of the context for each record. We propose techniques to protect the privacy of individuals through a relaxed notion of differential privacy to satisfy this requirement. The second major challenge is applying the proposed techniques efficiently, as they impose intensive computation to the base algorithm. To overcome this challenge, we propose a graph structure to map the contexts to, and introduce differentially private graph search algorithms as efficient solutions for the computation problem caused by differential privacy techniques.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Computational efficiency ; Context ; Data analysis ; Datasets ; Fraud ; Outliers (statistics) ; Privacy ; Search algorithms</subject><ispartof>arXiv.org, 2021-03</ispartof><rights>2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Shafieinejad, Masoumeh</creatorcontrib><creatorcontrib>Kerschbaum, Florian</creatorcontrib><creatorcontrib>Ilyas, Ihab F</creatorcontrib><title>PCOR: Private Contextual Outlier Release via Differentially Private Search</title><title>arXiv.org</title><description>Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, that are not outliers in the whole dataset but in some specific neighborhoods. Contextual outliers are particularly important in data exploration and targeted anomaly explanation and diagnosis. In these scenarios, the data owner computes the following information: i) The attributes that contribute to the abnormality of an outlier (metric), ii) Contextual description of the outlier's neighborhoods (context), and iii) The utility score of the context, e.g. its strength in showing the outlier's significance, or in relation to a particular explanation for the outlier. However, revealing the outlier's context leaks information about the other individuals in the population as well, violating their privacy. We address the issue of population privacy violations in this paper, and propose a solution for the two main challenges. In this setting, the data owner is required to release a valid context for the queried record, i.e. a context in which the record is an outlier. Hence, the first major challenge is that the privacy technique must preserve the validity of the context for each record. We propose techniques to protect the privacy of individuals through a relaxed notion of differential privacy to satisfy this requirement. The second major challenge is applying the proposed techniques efficiently, as they impose intensive computation to the base algorithm. To overcome this challenge, we propose a graph structure to map the contexts to, and introduce differentially private graph search algorithms as efficient solutions for the computation problem caused by differential privacy techniques.</description><subject>Computational efficiency</subject><subject>Context</subject><subject>Data analysis</subject><subject>Datasets</subject><subject>Fraud</subject><subject>Outliers (statistics)</subject><subject>Privacy</subject><subject>Search algorithms</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi70KwjAYAIMgWLTvEHAuxKStjWtUxKWldS8f8hVTQqv5Kfr2Ooiz0w13NyMRF2KTFCnnCxI71zPGeL7lWSYicq5UWe9oZfUEHqkaB49PH8DQMnij0dIaDYJDOmmge911aHHwGox5_a4GwV5vKzLvwDiMv1yS9fFwUafkbsdHQOfbfgx2-KiWp1LmMpNpIf6r3oXWPIk</recordid><startdate>20210309</startdate><enddate>20210309</enddate><creator>Shafieinejad, Masoumeh</creator><creator>Kerschbaum, Florian</creator><creator>Ilyas, Ihab F</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20210309</creationdate><title>PCOR: Private Contextual Outlier Release via Differentially Private Search</title><author>Shafieinejad, Masoumeh ; Kerschbaum, Florian ; Ilyas, Ihab F</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_24996959483</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computational efficiency</topic><topic>Context</topic><topic>Data analysis</topic><topic>Datasets</topic><topic>Fraud</topic><topic>Outliers (statistics)</topic><topic>Privacy</topic><topic>Search algorithms</topic><toplevel>online_resources</toplevel><creatorcontrib>Shafieinejad, Masoumeh</creatorcontrib><creatorcontrib>Kerschbaum, Florian</creatorcontrib><creatorcontrib>Ilyas, Ihab F</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Shafieinejad, Masoumeh</au><au>Kerschbaum, Florian</au><au>Ilyas, Ihab F</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>PCOR: Private Contextual Outlier Release via Differentially Private Search</atitle><jtitle>arXiv.org</jtitle><date>2021-03-09</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>Outlier detection plays a significant role in various real world applications such as intrusion, malfunction, and fraud detection. Traditionally, outlier detection techniques are applied to find outliers in the context of the whole dataset. However, this practice neglects contextual outliers, that are not outliers in the whole dataset but in some specific neighborhoods. Contextual outliers are particularly important in data exploration and targeted anomaly explanation and diagnosis. In these scenarios, the data owner computes the following information: i) The attributes that contribute to the abnormality of an outlier (metric), ii) Contextual description of the outlier's neighborhoods (context), and iii) The utility score of the context, e.g. its strength in showing the outlier's significance, or in relation to a particular explanation for the outlier. However, revealing the outlier's context leaks information about the other individuals in the population as well, violating their privacy. We address the issue of population privacy violations in this paper, and propose a solution for the two main challenges. In this setting, the data owner is required to release a valid context for the queried record, i.e. a context in which the record is an outlier. Hence, the first major challenge is that the privacy technique must preserve the validity of the context for each record. We propose techniques to protect the privacy of individuals through a relaxed notion of differential privacy to satisfy this requirement. The second major challenge is applying the proposed techniques efficiently, as they impose intensive computation to the base algorithm. To overcome this challenge, we propose a graph structure to map the contexts to, and introduce differentially private graph search algorithms as efficient solutions for the computation problem caused by differential privacy techniques.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2021-03
issn 2331-8422
language eng
recordid cdi_proquest_journals_2499695948
source Free E- Journals
subjects Computational efficiency
Context
Data analysis
Datasets
Fraud
Outliers (statistics)
Privacy
Search algorithms
title PCOR: Private Contextual Outlier Release via Differentially Private Search
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T09%3A20%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=PCOR:%20Private%20Contextual%20Outlier%20Release%20via%20Differentially%20Private%20Search&rft.jtitle=arXiv.org&rft.au=Shafieinejad,%20Masoumeh&rft.date=2021-03-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2499695948%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2499695948&rft_id=info:pmid/&rfr_iscdi=true