Deep Learning Based Open Set Acoustic Scene Classification

In this work, we compare the performance of three selected techniques for open set acoustic scene classification (ASC). We test thresholding of the softmax output of a deep network classifier, which is currently the most popular technique in ASC. We further compare the results with the Openm...

Bibliographic details
Main authors:	Kwiatkowska, Zuzanna; Kalinowski, Beniamin; Kośmider, Michał; Rykaczewski, Krzysztof
Format:	Article
Language:	eng
Subjects:	Computer Science - Learning; Computer Science - Sound

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Kwiatkowska, Zuzanna; Kalinowski, Beniamin; Kośmider, Michał; Rykaczewski, Krzysztof
description	In this work, we compare the performance of three selected techniques for open set acoustic scene classification (ASC). We test thresholding of the softmax output of a deep network classifier, which is currently the most popular technique in ASC. We further compare the results with the Openmax classifier, which is derived from the computer vision field. As the third model, we use the Adapted Class-Conditioned Autoencoder (Adapted C2AE), which is our variation of another computer vision technique called C2AE. Adapted C2AE enables a fairer comparison across the given experiments and simplifies the original inference procedure, making it more applicable in real-life scenarios. We also analyse two training scenarios: one without additional knowledge of unknown classes and another where a limited subset of examples from the unknown classes is available. We find that the C2AE-based method outperforms thresholding and Openmax, obtaining $85.5\%$ Area Under the Receiver Operating Characteristic curve (AUROC) and $66\%$ open set accuracy on the data used in the Detection and Classification of Acoustic Scenes and Events Challenge 2019 Task 1C.
doi_str_mv	10.48550/arxiv.2008.07247
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2008_07247</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2008_07247</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-5140fe09b368b3a4d56753b853cef287ce1ef6d515de747f87da7b8eb43677ce3</originalsourceid><addsrcrecordid>eNotj7FOwzAURb10qAof0An_QIIT23kuWxugVIrUod2jZ_sZWWrdKA4I_h5omc50j-5hbFmJUhmtxSOOX_GzrIUwpYBawZw9PRMNvCMcU0zvfIOZPN8PlPiBJr52l488RccPjhLx9oQ5xxAdTvGS7tgs4CnT_T8X7Pj6cmzfim6_3bXrrsAGoNCVEoHEysrGWInK6wa0tEZLR6E24Kii0HhdaU-gIBjwCNaQVfJ37kgu2MNNez3fD2M84_jd_0X01wj5A6iyQRE</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Deep Learning Based Open Set Acoustic Scene Classification</title><source>arXiv.org</source><creator>Kwiatkowska, Zuzanna ; Kalinowski, Beniamin ; Kośmider, Michał ; Rykaczewski, Krzysztof</creator><creatorcontrib>Kwiatkowska, Zuzanna ; Kalinowski, Beniamin ; Kośmider, Michał ; Rykaczewski, Krzysztof</creatorcontrib><description>In this work, we compare the performance of three selected techniques in open set acoustic scenes classification (ASC). We test thresholding of the softmax output of a deep network classifier, which is the most popular technique nowadays employed in ASC. Further we compare the results with the Openmax classifier which is derived from the computer vision field. As the third model, we use the Adapted Class-Conditioned Autoencoder (Adapted C2AE) which is our variation of another computer vision related technique called C2AE. Adapted C2AE encompasses a more fair comparison of the given experiments and simplifies the original inference procedure, making it more applicable in the real-life scenarios. We also analyse two training scenarios: without additional knowledge of unknown classes and another where a limited subset of examples from the unknown classes is available. 
We find that the C2AE based method outperforms the thresholding and Openmax, obtaining $85.5\%$ Area Under the Receiver Operating Characteristic curve (AUROC) and $66\%$ of open set accuracy on data used in Detection and Classification of Acoustic Scenes and Events Challenge 2019 Task 1C.</description><identifier>DOI: 10.48550/arxiv.2008.07247</identifier><language>eng</language><subject>Computer Science - Learning ; Computer Science - Sound</subject><creationdate>2020-08</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2008.07247$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2008.07247$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Kwiatkowska, Zuzanna</creatorcontrib><creatorcontrib>Kalinowski, Beniamin</creatorcontrib><creatorcontrib>Kośmider, Michał</creatorcontrib><creatorcontrib>Rykaczewski, Krzysztof</creatorcontrib><title>Deep Learning Based Open Set Acoustic Scene Classification</title><description>In this work, we compare the performance of three selected techniques in open set acoustic scenes classification (ASC). We test thresholding of the softmax output of a deep network classifier, which is the most popular technique nowadays employed in ASC. Further we compare the results with the Openmax classifier which is derived from the computer vision field. As the third model, we use the Adapted Class-Conditioned Autoencoder (Adapted C2AE) which is our variation of another computer vision related technique called C2AE. 
Adapted C2AE encompasses a more fair comparison of the given experiments and simplifies the original inference procedure, making it more applicable in the real-life scenarios. We also analyse two training scenarios: without additional knowledge of unknown classes and another where a limited subset of examples from the unknown classes is available. We find that the C2AE based method outperforms the thresholding and Openmax, obtaining $85.5\%$ Area Under the Receiver Operating Characteristic curve (AUROC) and $66\%$ of open set accuracy on data used in Detection and Classification of Acoustic Scenes and Events Challenge 2019 Task 1C.</description><subject>Computer Science - Learning</subject><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj7FOwzAURb10qAof0An_QIIT23kuWxugVIrUod2jZ_sZWWrdKA4I_h5omc50j-5hbFmJUhmtxSOOX_GzrIUwpYBawZw9PRMNvCMcU0zvfIOZPN8PlPiBJr52l488RccPjhLx9oQ5xxAdTvGS7tgs4CnT_T8X7Pj6cmzfim6_3bXrrsAGoNCVEoHEysrGWInK6wa0tEZLR6E24Kii0HhdaU-gIBjwCNaQVfJ37kgu2MNNez3fD2M84_jd_0X01wj5A6iyQRE</recordid><startdate>20200817</startdate><enddate>20200817</enddate><creator>Kwiatkowska, Zuzanna</creator><creator>Kalinowski, Beniamin</creator><creator>Kośmider, Michał</creator><creator>Rykaczewski, Krzysztof</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200817</creationdate><title>Deep Learning Based Open Set Acoustic Scene Classification</title><author>Kwiatkowska, Zuzanna ; Kalinowski, Beniamin ; Kośmider, Michał ; Rykaczewski, Krzysztof</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-5140fe09b368b3a4d56753b853cef287ce1ef6d515de747f87da7b8eb43677ce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Learning</topic><topic>Computer Science - 
Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Kwiatkowska, Zuzanna</creatorcontrib><creatorcontrib>Kalinowski, Beniamin</creatorcontrib><creatorcontrib>Kośmider, Michał</creatorcontrib><creatorcontrib>Rykaczewski, Krzysztof</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kwiatkowska, Zuzanna</au><au>Kalinowski, Beniamin</au><au>Kośmider, Michał</au><au>Rykaczewski, Krzysztof</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Deep Learning Based Open Set Acoustic Scene Classification</atitle><date>2020-08-17</date><risdate>2020</risdate><abstract>In this work, we compare the performance of three selected techniques in open set acoustic scenes classification (ASC). We test thresholding of the softmax output of a deep network classifier, which is the most popular technique nowadays employed in ASC. Further we compare the results with the Openmax classifier which is derived from the computer vision field. As the third model, we use the Adapted Class-Conditioned Autoencoder (Adapted C2AE) which is our variation of another computer vision related technique called C2AE. Adapted C2AE encompasses a more fair comparison of the given experiments and simplifies the original inference procedure, making it more applicable in the real-life scenarios. We also analyse two training scenarios: without additional knowledge of unknown classes and another where a limited subset of examples from the unknown classes is available. 
We find that the C2AE based method outperforms the thresholding and Openmax, obtaining $85.5\%$ Area Under the Receiver Operating Characteristic curve (AUROC) and $66\%$ of open set accuracy on data used in Detection and Classification of Acoustic Scenes and Events Challenge 2019 Task 1C.</abstract><doi>10.48550/arxiv.2008.07247</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2008.07247
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2008_07247
source	arXiv.org
subjects	Computer Science - Learning; Computer Science - Sound
title	Deep Learning Based Open Set Acoustic Scene Classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T22%3A22%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Learning%20Based%20Open%20Set%20Acoustic%20Scene%20Classification&rft.au=Kwiatkowska,%20Zuzanna&rft.date=2020-08-17&rft_id=info:doi/10.48550/arxiv.2008.07247&rft_dat=%3Carxiv_GOX%3E2008_07247%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true