FRS: Adaptive Score for Improving Acoustic Source Classification from Noisy Signals

This paper proposes a frame relevance score to improve the classification of environmental acoustic sources from noisy speech signals. The importance of each short-time frame for the classification results is objectively interpreted by SHAP values. The SHAP-based frame relevance score enables the selection of frames that are more appropriate to improve the discrimination power of the acoustic models. The frame selection can be used as a pre-training step for any classification approach. Evaluation experiments consider the recognition of ten background sources from noisy speech signals. The classical approach based on MFCC and GMM is adopted to show that the selected frames can better distinguish the acoustic classes. Moreover, the frame selection outperforms a surrogate-based adaptive learning solution. Experiments are also conducted with a recently proposed pre-trained neural network that achieves high classification rates. The proposed SHAP-based selection shows improved classification accuracies even for this scenario.
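
To make the pipeline concrete, below is a minimal sketch of how such a SHAP-based frame relevance score could be computed, assuming per-frame MFCC features (via librosa), one GMM per acoustic class (scikit-learn), and shap's model-agnostic KernelExplainer as the SHAP estimator. The function names (frame_relevance_scores, select_frames) and the top_ratio selection rule are illustrative assumptions, not the paper's implementation.

```python
# Sketch only: SHAP-based frame relevance scoring for frame selection.
# Assumptions (not from the paper): librosa MFCCs, one GaussianMixture per
# class, shap.KernelExplainer, and a hypothetical top_ratio selection rule.
import numpy as np
import librosa
import shap
from sklearn.mixture import GaussianMixture

def mfcc_frames(signal, sr, n_mfcc=13):
    """Short-time MFCC features, one row per frame: shape (n_frames, n_mfcc)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T

def fit_class_gmms(frames_per_class, n_components=8):
    """Train one GMM per acoustic class on its pooled training frames."""
    return [GaussianMixture(n_components=n_components).fit(X)
            for X in frames_per_class]

def class_scores(gmms, X):
    """Per-frame log-likelihood under each class model: shape (n_frames, n_classes)."""
    return np.stack([g.score_samples(X) for g in gmms], axis=1)

def frame_relevance_scores(gmms, X, background):
    """One relevance value per frame: total |SHAP| attribution of its features."""
    explainer = shap.KernelExplainer(lambda Z: class_scores(gmms, Z), background)
    sv = np.asarray(explainer.shap_values(X))
    # Depending on the shap version, sv is (n_classes, n_frames, n_mfcc) or
    # (n_frames, n_mfcc, n_classes); collapse every axis except the frame axis.
    frame_axis = list(sv.shape).index(len(X))
    return np.abs(sv).sum(axis=tuple(a for a in range(sv.ndim) if a != frame_axis))

def select_frames(X, relevance, top_ratio=0.5):
    """Keep the highest-relevance frames before retraining the class models."""
    k = max(1, int(top_ratio * len(X)))
    return X[np.argsort(relevance)[-k:]]
```

Under these assumptions, the selected frames would replace the full frame set when retraining the class models, mirroring the pre-training selection step described in the abstract.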

Bibliographic Details

Published in: IEEE Signal Processing Letters, 2024-01, Vol. 31, pp. 1-5
Main authors: Marinati, R.; Coelho, R.; Zao, L.
Format: Article
Language: English
Online access: Order full text
DOI: 10.1109/LSP.2024.3358097
Publisher: IEEE, New York
CODEN: ISPLEM
Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
ORCID: 0000-0002-6438-9380; 0000-0001-8662-7522; 0000-0002-8170-3992
ISSN: 1070-9908
EISSN: 1558-2361
Source: IEEE Electronic Library (IEL)
Record ID: cdi_ieee_primary_10413560
Subjects:
acoustic source classification
Acoustics
Background noise
Classification
Convolutional neural networks
Mel frequency cepstral coefficient
Neural networks
Noise measurement
noisy speech signals
Signal classification
Sound sources
Speech
Speech enhancement
Speech recognition
surrogates
Training