ALTBI: Constructing Improved Outlier Detection Models via Optimization of Inlier-Memorization Effect
Outlier detection (OD) is the task of identifying unusual observations (or outliers) from a given or upcoming data by learning unique patterns of normal observations (or inliers). Recently, a study introduced a powerful unsupervised OD (UOD) solver based on a new observation of deep generative model...
Saved in:
Main authors: | Cho, Seoyoung; Hwang, Jaesung; Bak, Kwan-Young; Kim, Dongha |
---|---|
Format: | Article |
Language: | English |
Subject terms: | Computer Science - Learning; Statistics - Machine Learning |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Cho, Seoyoung; Hwang, Jaesung; Bak, Kwan-Young; Kim, Dongha |
description | Outlier detection (OD) is the task of identifying unusual observations (or outliers) in given or upcoming data by learning the unique patterns of normal observations (or inliers). Recently, a study introduced a powerful unsupervised OD (UOD) solver based on a new observation about deep generative models, called the inlier-memorization (IM) effect, which suggests that generative models memorize inliers before outliers in early learning stages. In this study, we aim to develop a theoretically principled method that addresses UOD tasks by maximally exploiting the IM effect. We begin by noting that the IM effect emerges more clearly when the training data contain fewer outliers. This finding suggests that the IM effect can be enhanced in UOD regimes if outliers are effectively excluded from mini-batches when designing the loss function. To this end, we introduce two main techniques: 1) increasing the mini-batch size as model training proceeds, and 2) using an adaptive threshold to compute the truncated loss function. We theoretically show that these two techniques effectively filter outliers out of the truncated loss function, allowing us to exploit the IM effect to the fullest. Coupled with an additional ensemble strategy, we propose our method and term it Adaptive Loss Truncation with Batch Increment (ALTBI). We provide extensive experimental results demonstrating that ALTBI achieves state-of-the-art performance in identifying outliers compared to other recent methods, at significantly lower computational cost. Additionally, we show that our method remains robust when combined with privacy-preserving algorithms. |
doi_str_mv | 10.48550/arxiv.2408.09791 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2408.09791 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2408_09791 |
source | arXiv.org |
subjects | Computer Science - Learning; Statistics - Machine Learning |
title | ALTBI: Constructing Improved Outlier Detection Models via Optimization of Inlier-Memorization Effect |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T23%3A18%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ALTBI:%20Constructing%20Improved%20Outlier%20Detection%20Models%20via%20Optimization%20of%20Inlier-Memorization%20Effect&rft.au=Cho,%20Seoyoung&rft.date=2024-08-19&rft_id=info:doi/10.48550/arxiv.2408.09791&rft_dat=%3Carxiv_GOX%3E2408_09791%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
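The two techniques named in the abstract (a mini-batch size that grows as training proceeds, and an adaptive threshold that truncates the loss to presumed inliers) can be illustrated with a minimal sketch. This is not the authors' implementation: a toy squared-error loss stands in for a deep generative model's loss, and all function names and hyperparameters (`keep_ratio`, `growth`, etc.) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def per_sample_loss(batch, center):
    # Stand-in per-sample loss: squared distance to a fixed "model" center.
    # ALTBI would use a deep generative model's per-sample loss here.
    return np.sum((batch - center) ** 2, axis=1)

def truncated_mean_loss(losses, keep_ratio):
    # Adaptive threshold: keep only the smallest losses (presumed inliers),
    # discarding the top (1 - keep_ratio) fraction as suspected outliers.
    threshold = np.quantile(losses, keep_ratio)
    kept = losses[losses <= threshold]
    return kept.mean(), threshold

def train_sketch(data, n_steps=5, batch0=32, growth=1.5, keep_ratio=0.9):
    # Mini-batch size grows geometrically across steps; per the abstract,
    # larger batches plus loss truncation sharpen the IM effect.
    center = data.mean(axis=0)
    batch_size = batch0
    history = []
    for _ in range(n_steps):
        size = min(int(batch_size), len(data))
        idx = rng.choice(len(data), size=size, replace=False)
        losses = per_sample_loss(data[idx], center)
        loss, _ = truncated_mean_loss(losses, keep_ratio)
        history.append((size, loss))
        batch_size *= growth
    return history
```

Running `train_sketch` on a mixture of inliers and a few far-away outliers shows the batch size increasing step by step while the truncated loss stays dominated by the inlier cluster.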