FRS: Adaptive Score for Improving Acoustic Source Classification from Noisy Signals

This paper proposes a frame relevance score to improve the classification of environmental acoustic sources from noisy speech signals. The importance of each short-time frame for the classification results is objectively interpreted by SHAP values. The SHAP-based frame relevance score enables the selection of frames that are more appropriate to improve the discrimination power of the acoustic models. The frame selection can be used as a pre-training step for any classification approach. Evaluation experiments consider the recognition of ten background sources from noisy speech signals. The classical approach based on MFCC and GMM is adopted to show that the selected frames can better distinguish the acoustic classes. Moreover, the frame selection outperforms a surrogate-based adaptive learning solution. Experiments are also conducted with a recently proposed pre-trained neural network that achieves high classification rates. The proposed SHAP-based selection shows improved classification accuracies even for this scenario.
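
To make the pipeline concrete, below is a minimal sketch of how such a SHAP-based frame relevance score could be computed, assuming per-frame MFCC features (via librosa), one GMM per acoustic class (scikit-learn), and shap's model-agnostic KernelExplainer as the SHAP estimator. The function names (frame_relevance_scores, select_frames) and the top_ratio selection rule are illustrative assumptions, not the paper's implementation.

```python
# Sketch only: SHAP-based frame relevance scoring for frame selection.
# Assumptions (not from the paper): librosa MFCCs, one GaussianMixture per
# class, shap.KernelExplainer, and a hypothetical top_ratio selection rule.
import numpy as np
import librosa
import shap
from sklearn.mixture import GaussianMixture

def mfcc_frames(signal, sr, n_mfcc=13):
    """Short-time MFCC features, one row per frame: shape (n_frames, n_mfcc)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T

def fit_class_gmms(frames_per_class, n_components=8):
    """Train one GMM per acoustic class on its pooled training frames."""
    return [GaussianMixture(n_components=n_components).fit(X)
            for X in frames_per_class]

def class_scores(gmms, X):
    """Per-frame log-likelihood under each class model: shape (n_frames, n_classes)."""
    return np.stack([g.score_samples(X) for g in gmms], axis=1)

def frame_relevance_scores(gmms, X, background):
    """One relevance value per frame: total |SHAP| attribution of its features."""
    explainer = shap.KernelExplainer(lambda Z: class_scores(gmms, Z), background)
    sv = np.asarray(explainer.shap_values(X))
    # Depending on the shap version, sv is (n_classes, n_frames, n_mfcc) or
    # (n_frames, n_mfcc, n_classes); collapse every axis except the frame axis.
    frame_axis = list(sv.shape).index(len(X))
    return np.abs(sv).sum(axis=tuple(a for a in range(sv.ndim) if a != frame_axis))

def select_frames(X, relevance, top_ratio=0.5):
    """Keep the highest-relevance frames before retraining the class models."""
    k = max(1, int(top_ratio * len(X)))
    return X[np.argsort(relevance)[-k:]]
```

Under these assumptions, the selected frames would replace the full frame set when retraining the class models, mirroring the pre-training selection step described in the abstract.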

Bibliographic Details

Published in: IEEE Signal Processing Letters, 2024-01, Vol. 31, pp. 1-5
Main authors: Marinati, R.; Coelho, R.; Zao, L.
Format: Article
Language: English
Online access: Order full text
DOI: 10.1109/LSP.2024.3358097
Publisher: IEEE, New York
CODEN: ISPLEM
Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
ORCID: 0000-0002-6438-9380; 0000-0001-8662-7522; 0000-0002-8170-3992
ISSN: 1070-9908
EISSN: 1558-2361
Source: IEEE Electronic Library (IEL)
Record ID: cdi_ieee_primary_10413560
Subjects:
acoustic source classification
Acoustics
Background noise
Classification
Convolutional neural networks
Mel frequency cepstral coefficient
Neural networks
Noise measurement
noisy speech signals
Signal classification
Sound sources
Speech
Speech enhancement
Speech recognition
surrogates
Training