DOA Estimation for Multiple Speech Sources Based on Flexible Single-Source Zones and Concentration Weighting

Direction of arrival (DOA) estimation is the key to many audio applications. Recently, sparse component analysis (SCA)-based methods have attracted much attention, in which single-source points (SSPs) and single-source zones (SSZs) where one source is dominant over the others in time-frequency domai...

Bibliographic Details
Published in:	IEEE sensors journal 2023-05, Vol.23 (10), p.1-1
Main Authors:	Zhao, Zhao; Kan, Hongrui; Lin, Jiale; Xu, Zhiyong
Format:	Article
Language:	eng
Subjects:	Algorithms; Angular resolution; Bins; concentration weighting; Correlation coefficients; Direction of arrival; Direction-of-arrival estimation; DOA estimation; Estimation; flexible single-source zone; Histograms; Matched pursuit; microphone array; Microphone arrays; Reverberation; Sensors; Signal to noise ratio; Speech; Time lag; Time-frequency analysis; Weighting

container_end_page	1
container_issue	10
container_start_page	1
container_title	IEEE sensors journal
container_volume	23
creator	Zhao, Zhao; Kan, Hongrui; Lin, Jiale; Xu, Zhiyong
description	Direction of arrival (DOA) estimation is key to many audio applications. Recently, sparse component analysis (SCA)-based methods have attracted much attention. These methods usually detect single-source points (SSPs) and single-source zones (SSZs), where one source dominates the others in the time-frequency domain, to construct a pooled histogram containing multi-source DOA information. However, the SSZ size in existing methods is fixed and empirically predetermined, so it cannot accommodate the varying spectro-temporal properties of speech sources. Furthermore, a higher SSP concentration in an SSZ implies a locally stronger dominant source and more reliable DOA information, yet this has not been taken into account. To address these problems, this paper presents a DOA estimation algorithm for multiple speech sources based on flexible SSZs and concentration weighting. First, in each frame, correlation coefficients of time delay vectors across adjacent frequency bins are calculated to identify SSPs, and flexible SSZs are then constructed from a varying number of SSPs located at consecutive frequency bins. Next, the number of SSPs in each flexible SSZ is taken as a proxy for its concentration degree and used as the weighting factor in forming the pooled histogram. Finally, a matching pursuit (MP)-based approach is used to obtain the multi-source DOA estimates. Simulation results show that the proposed method significantly outperforms existing approaches in terms of the noise floor of the pooled histogram, angular resolution, and performance under various signal-to-noise ratios and reverberant conditions. Real-world experiments also verify its effectiveness and demonstrate considerably reduced computational complexity.
doi_str_mv	10.1109/JSEN.2023.3263861
format	Article
coden	ISJEAZ
date	2023-05-15
ieee_id	10097554
orcid	0000-0001-7901-500X 0000-0003-4470-0888
publisher	New York: IEEE
rights	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
fulltext	fulltext_linktorsrc
identifier	ISSN: 1530-437X
ispartof	IEEE sensors journal, 2023-05, Vol.23 (10), p.1-1
issn	1530-437X 1558-1748
language	eng
recordid	cdi_crossref_primary_10_1109_JSEN_2023_3263861
source	IEEE Xplore
subjects	Algorithms; Angular resolution; Bins; concentration weighting; Correlation coefficients; Direction of arrival; Direction-of-arrival estimation; DOA estimation; Estimation; flexible single-source zone; Histograms; Matched pursuit; microphone array; Microphone arrays; Reverberation; Sensors; Signal to noise ratio; Speech; Time lag; Time-frequency analysis; Weighting
title	DOA Estimation for Multiple Speech Sources Based on Flexible Single-Source Zones and Concentration Weighting
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T17%3A37%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DOA%20Estimation%20for%20Multiple%20Speech%20Sources%20Based%20on%20Flexible%20Single-Source%20Zones%20and%20Concentration%20Weighting&rft.jtitle=IEEE%20sensors%20journal&rft.au=Zhao,%20Zhao&rft.date=2023-05-15&rft.volume=23&rft.issue=10&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1530-437X&rft.eissn=1558-1748&rft.coden=ISJEAZ&rft_id=info:doi/10.1109/JSEN.2023.3263861&rft_dat=%3Cproquest_RIE%3E2814572192%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2814572192&rft_id=info:pmid/&rft_ieee_id=10097554&rfr_iscdi=true
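
The first three steps summarized in the description field (SSP detection, flexible SSZ construction, concentration-weighted histogram) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record gives no formulas, so the function names, the 0.9 correlation threshold, the 1-degree histogram resolution, the rule that bin k is flagged when it correlates with bin k+1, and the rule that every SSP contributes its zone's SSP count as weight are all assumptions made for illustration. The matching pursuit step is omitted.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    # Return 0.0 for constant vectors, where the coefficient is undefined.
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def detect_ssps(delay_vectors, threshold=0.9):
    """Flag frequency bin k as an SSP when its time delay vector correlates
    strongly with that of bin k+1 (assumed rule; threshold is hypothetical).
    The last bin has no upper neighbour and is never flagged here."""
    flags = [False] * len(delay_vectors)
    for k in range(len(delay_vectors) - 1):
        if pearson(delay_vectors[k], delay_vectors[k + 1]) >= threshold:
            flags[k] = True
    return flags


def flexible_zones(flags):
    """Group runs of consecutive SSP bins into zones of varying size.
    Each zone is returned as (start, stop) with stop exclusive."""
    zones, start = [], None
    for k, is_ssp in enumerate(flags):
        if is_ssp and start is None:
            start = k
        elif not is_ssp and start is not None:
            zones.append((start, k))
            start = None
    if start is not None:
        zones.append((start, len(flags)))
    return zones


def weighted_histogram(flags, doas_deg, n_bins=360):
    """Pooled DOA histogram in which every SSP adds a weight equal to the
    number of SSPs in its zone (assumed form of concentration weighting)."""
    hist = [0.0] * n_bins
    for start, stop in flexible_zones(flags):
        weight = stop - start  # SSP count of the zone
        for k in range(start, stop):
            b = int((doas_deg[k] % 360) * n_bins / 360) % n_bins
            hist[b] += weight
    return hist


# Hypothetical single-frame data: SSP flags and per-bin DOA estimates (degrees).
flags = [True, True, True, False, True, False]
doas = [30.2, 30.7, 29.9, 200.0, 120.4, 10.0]
hist = weighted_histogram(flags, doas)
```

In this toy frame the three-bin zone near 30 degrees contributes with weight 3 per SSP, while the isolated SSP at about 120 degrees contributes with weight 1, so the denser zone dominates the histogram.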