Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma

This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of the unsupervised-ADS is to detect unknown anomalous sounds without training data of anomalous sounds. The use of an AE as a normal model...

Bibliographic details
Published in:	IEEE/ACM transactions on audio, speech, and language processing, 2019-01, Vol.27 (1), p.212-224
Main authors:	Koizumi, Yuma; Saito, Shoichiro; Uematsu, Hisashi; Kawachi, Yuta; Harada, Noboru
Format:	Article
Language:	eng
Subjects:	Acoustics; and autoencoder; Anomalies; Anomaly detection in sound; Computer simulation; Deep learning; Feature extraction; Linear programming; Neyman-Pearson lemma; Probability density function; Reconstruction; Sound; Speech processing; State of the art; Task analysis; Training; Training data
Online access:	Full text

container_end_page	224
container_issue	1
container_start_page	212
container_title	IEEE/ACM transactions on audio, speech, and language processing
container_volume	27
creator	Koizumi, Yuma; Saito, Shoichiro; Uematsu, Hisashi; Kawachi, Yuta; Harada, Noboru
description	This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of the unsupervised-ADS is to detect unknown anomalous sounds without training data of anomalous sounds. The use of an AE as a normal model is a state-of-the-art technique for the unsupervised-ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds, and the anomaly score is calculated as the reconstruction error of the observed sound. Unfortunately, since this training procedure does not take into account the anomaly score for anomalous sounds, the true positive rate (TPR) does not necessarily increase. In this study, we define an objective function based on the Neyman-Pearson lemma by considering the ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrary low FPR condition. To calculate the TPR in the objective function, we consider that the set of anomalous sounds is the complementary set of normal sounds and simulate anomalous sounds by using a rejection sampling algorithm. Through experiments using synthetic data, we found that the proposed method improved the performance measures of the ADS under low FPR conditions. In addition, we confirmed that the proposed method could detect anomalous sounds in real environments.
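The hypothesis-test framing in the description (fix a low FPR, then ask how high the TPR is) can be illustrated with a minimal sketch. This is not the paper's training objective or its rejection-sampling procedure; it only shows how a decision threshold might be chosen from anomaly scores of normal sounds and how the resulting TPR could be measured. The function names `threshold_at_fpr` and `rate_above`, and the toy scores, are assumptions made for illustration.

```python
import numpy as np


def threshold_at_fpr(normal_scores, target_fpr):
    """Pick a threshold whose empirical FPR on normal data is at most target_fpr.

    Hypothetical helper: a sample is flagged as anomalous when its score is
    strictly greater than the returned threshold.
    """
    scores = np.sort(np.asarray(normal_scores, dtype=float))
    n = scores.size
    # Number of normal samples that are allowed to exceed the threshold.
    k = min(int(np.floor(target_fpr * n)), n - 1)
    # The (k+1)-th largest normal score; at most k samples lie strictly above it.
    return scores[n - k - 1]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold (FPR or TPR, depending on input)."""
    return float(np.mean(np.asarray(scores, dtype=float) > threshold))


# Toy anomaly scores (for example, reconstruction errors); values are made up.
normal_scores = np.arange(10.0)                      # scores of normal sounds
anomalous_scores = np.array([5.0, 8.0, 9.0, 12.0])   # scores of anomalous sounds

thr = threshold_at_fpr(normal_scores, target_fpr=0.2)
fpr = rate_above(normal_scores, thr)
tpr = rate_above(anomalous_scores, thr)
```

In this sketch the threshold is fixed after scoring; the description instead states that the proposed objective trains the AE itself so that the TPR is maximized under the low-FPR condition.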
doi_str_mv	10.1109/TASLP.2018.2877258
format	Article
fullrecord	<record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_8501554</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8501554</ieee_id><sourcerecordid>2141192581</sourcerecordid><originalsourceid>FETCH-LOGICAL-c405t-ebc294d850462e932aec4193f79c3452c2ca7e1cf0fd7202745183ab41163e683</originalsourceid><addsrcrecordid>eNo9kF1PwjAUhhujiQT5A3rTxOthv0bXSwS_kkVJgEvTlO5MR1g7182Ef28R9Oo0533etnkQuqZkTClRd6vpMl-MGaHZmGVSsjQ7QwPGmUoUJ-L878wUuUSjELaEEEqkUlIM0Pvahb6B9rsKUOA5dGC7yjvsSzx1vjY73we89L0r8L05IDGbAzQ4B9O6yn1gE6PuE_Ar7GvjkkXchwjlUNfmCl2UZhdgdJpDtH58WM2ek_zt6WU2zRMrSNolsLFMiSJLiZgwUJwZsIIqXkpluUiZZdZIoLYkZSEZYVKkNONmIyidcJhkfIhuj_c2rf_qIXR66_vWxSc1o5FS0QmNFDtStvUhtFDqpq1q0-41JfpgUv-a1AeT-mQylm6OpQoA_gvxqzRNBf8BplVuUA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2141192581</pqid></control><display><type>article</type><title>Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma</title><source>ACM Digital Library Complete</source><source>IEEE Electronic Library (IEL)</source><creator>Koizumi, Yuma ; Saito, Shoichiro ; Uematsu, Hisashi ; Kawachi, Yuta ; Harada, Noboru</creator><creatorcontrib>Koizumi, Yuma ; Saito, Shoichiro ; Uematsu, Hisashi ; Kawachi, Yuta ; Harada, Noboru</creatorcontrib><description>This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of the unsupervised-ADS is to detect unknown anomalous sounds without training data of anomalous sounds. The use of an AE as a normal model is a state-of-the-art technique for the unsupervised-ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds, and the anomaly score is calculated as the reconstruction error of the observed sound. 
Unfortunately, since this training procedure does not take into account the anomaly score for anomalous sounds, the true positive rate (TPR) does not necessarily increase. In this study, we define an objective function based on the Neyman-Pearson lemma by considering the ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrary low FPR condition. To calculate the TPR in the objective function, we consider that the set of anomalous sounds is the complementary set of normal sounds and simulate anomalous sounds by using a rejection sampling algorithm. Through experiments using synthetic data, we found that the proposed method improved the performance measures of the ADS under low FPR conditions. In addition, we confirmed that the proposed method could detect anomalous sounds in real environments.</description><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASLP.2018.2877258</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Acoustics ; and autoencoder ; Anomalies ; Anomaly detection in sound ; Computer simulation ; Deep learning ; Feature extraction ; Linear programming ; Neyman-Pearson lemma ; Probability density function ; Reconstruction ; Sound ; Speech processing ; State of the art ; Task analysis ; Training ; Training data</subject><ispartof>IEEE/ACM transactions on audio, speech, and language processing, 2019-01, Vol.27 (1), p.212-224</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2019</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c405t-ebc294d850462e932aec4193f79c3452c2ca7e1cf0fd7202745183ab41163e683</citedby><cites>FETCH-LOGICAL-c405t-ebc294d850462e932aec4193f79c3452c2ca7e1cf0fd7202745183ab41163e683</cites><orcidid>0000-0003-3645-6213</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8501554$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids></links><search><creatorcontrib>Koizumi, Yuma</creatorcontrib><creatorcontrib>Saito, Shoichiro</creatorcontrib><creatorcontrib>Uematsu, Hisashi</creatorcontrib><creatorcontrib>Kawachi, Yuta</creatorcontrib><creatorcontrib>Harada, Noboru</creatorcontrib><title>Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma</title><title>IEEE/ACM transactions on audio, speech, and language processing</title><addtitle>TASLP</addtitle><description>This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of the unsupervised-ADS is to detect unknown anomalous sounds without training data of anomalous sounds. The use of an AE as a normal model is a state-of-the-art technique for the unsupervised-ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds, and the anomaly score is calculated as the reconstruction error of the observed sound. Unfortunately, since this training procedure does not take into account the anomaly score for anomalous sounds, the true positive rate (TPR) does not necessarily increase. 
In this study, we define an objective function based on the Neyman-Pearson lemma by considering the ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrary low FPR condition. To calculate the TPR in the objective function, we consider that the set of anomalous sounds is the complementary set of normal sounds and simulate anomalous sounds by using a rejection sampling algorithm. Through experiments using synthetic data, we found that the proposed method improved the performance measures of the ADS under low FPR conditions. In addition, we confirmed that the proposed method could detect anomalous sounds in real environments.</description><subject>Acoustics</subject><subject>and autoencoder</subject><subject>Anomalies</subject><subject>Anomaly detection in sound</subject><subject>Computer simulation</subject><subject>Deep learning</subject><subject>Feature extraction</subject><subject>Linear programming</subject><subject>Neyman-Pearson lemma</subject><subject>Probability density function</subject><subject>Reconstruction</subject><subject>Sound</subject><subject>Speech processing</subject><subject>State of the art</subject><subject>Task analysis</subject><subject>Training</subject><subject>Training 
data</subject><issn>2329-9290</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><recordid>eNo9kF1PwjAUhhujiQT5A3rTxOthv0bXSwS_kkVJgEvTlO5MR1g7182Ef28R9Oo0533etnkQuqZkTClRd6vpMl-MGaHZmGVSsjQ7QwPGmUoUJ-L878wUuUSjELaEEEqkUlIM0Pvahb6B9rsKUOA5dGC7yjvsSzx1vjY73we89L0r8L05IDGbAzQ4B9O6yn1gE6PuE_Ar7GvjkkXchwjlUNfmCl2UZhdgdJpDtH58WM2ek_zt6WU2zRMrSNolsLFMiSJLiZgwUJwZsIIqXkpluUiZZdZIoLYkZSEZYVKkNONmIyidcJhkfIhuj_c2rf_qIXR66_vWxSc1o5FS0QmNFDtStvUhtFDqpq1q0-41JfpgUv-a1AeT-mQylm6OpQoA_gvxqzRNBf8BplVuUA</recordid><startdate>201901</startdate><enddate>201901</enddate><creator>Koizumi, Yuma</creator><creator>Saito, Shoichiro</creator><creator>Uematsu, Hisashi</creator><creator>Kawachi, Yuta</creator><creator>Harada, Noboru</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-3645-6213</orcidid></search><sort><creationdate>201901</creationdate><title>Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma</title><author>Koizumi, Yuma ; Saito, Shoichiro ; Uematsu, Hisashi ; Kawachi, Yuta ; Harada, Noboru</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c405t-ebc294d850462e932aec4193f79c3452c2ca7e1cf0fd7202745183ab41163e683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Acoustics</topic><topic>and autoencoder</topic><topic>Anomalies</topic><topic>Anomaly detection in sound</topic><topic>Computer simulation</topic><topic>Deep learning</topic><topic>Feature 
extraction</topic><topic>Linear programming</topic><topic>Neyman-Pearson lemma</topic><topic>Probability density function</topic><topic>Reconstruction</topic><topic>Sound</topic><topic>Speech processing</topic><topic>State of the art</topic><topic>Task analysis</topic><topic>Training</topic><topic>Training data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Koizumi, Yuma</creatorcontrib><creatorcontrib>Saito, Shoichiro</creatorcontrib><creatorcontrib>Uematsu, Hisashi</creatorcontrib><creatorcontrib>Kawachi, Yuta</creatorcontrib><creatorcontrib>Harada, Noboru</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE/ACM transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Koizumi, Yuma</au><au>Saito, Shoichiro</au><au>Uematsu, Hisashi</au><au>Kawachi, Yuta</au><au>Harada, Noboru</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma</atitle><jtitle>IEEE/ACM transactions on audio, speech, and language 
processing</jtitle><stitle>TASLP</stitle><date>2019-01</date><risdate>2019</risdate><volume>27</volume><issue>1</issue><spage>212</spage><epage>224</epage><pages>212-224</pages><issn>2329-9290</issn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of the unsupervised-ADS is to detect unknown anomalous sounds without training data of anomalous sounds. The use of an AE as a normal model is a state-of-the-art technique for the unsupervised-ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds, and the anomaly score is calculated as the reconstruction error of the observed sound. Unfortunately, since this training procedure does not take into account the anomaly score for anomalous sounds, the true positive rate (TPR) does not necessarily increase. In this study, we define an objective function based on the Neyman-Pearson lemma by considering the ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrary low FPR condition. To calculate the TPR in the objective function, we consider that the set of anomalous sounds is the complementary set of normal sounds and simulate anomalous sounds by using a rejection sampling algorithm. Through experiments using synthetic data, we found that the proposed method improved the performance measures of the ADS under low FPR conditions. In addition, we confirmed that the proposed method could detect anomalous sounds in real environments.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TASLP.2018.2877258</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0003-3645-6213</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2329-9290
ispartof	IEEE/ACM transactions on audio, speech, and language processing, 2019-01, Vol.27 (1), p.212-224
issn	2329-9290; 2329-9304
language	eng
recordid	cdi_ieee_primary_8501554
source	ACM Digital Library Complete; IEEE Electronic Library (IEL)
subjects	Acoustics; and autoencoder; Anomalies; Anomaly detection in sound; Computer simulation; Deep learning; Feature extraction; Linear programming; Neyman-Pearson lemma; Probability density function; Reconstruction; Sound; Speech processing; State of the art; Task analysis; Training; Training data
title	Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T12%3A22%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Unsupervised%20Detection%20of%20Anomalous%20Sound%20Based%20on%20Deep%20Learning%20and%20the%20Neyman-Pearson%20Lemma&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Koizumi,%20Yuma&rft.date=2019-01&rft.volume=27&rft.issue=1&rft.spage=212&rft.epage=224&rft.pages=212-224&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASLP.2018.2877258&rft_dat=%3Cproquest_ieee_%3E2141192581%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2141192581&rft_id=info:pmid/&rft_ieee_id=8501554&rfr_iscdi=true