Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding

The performance of speech enhancement algorithms in a multi-speaker scenario depends on correctly identifying the target speaker to be enhanced. Auditory attention decoding (AAD) methods make it possible to identify, from single-trial EEG recordings, the target speaker that the listener is attending to. Aiming to enhance the target speaker while suppressing interfering speakers, reverberation and ambient noise, in this paper we propose a cognitive-driven multi-microphone speech enhancement system which combines a neural-network-based mask estimator, weighted minimum power distortionless response (WPD) convolutional beamformers and AAD. To control the suppression of the interfering speaker, we also propose an extension incorporating an interference suppression constraint. The experimental results show that the proposed system outperforms state-of-the-art cognitive-driven speech enhancement systems in challenging reverberant and noisy conditions.
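
The abstract does not spell out the AAD method, but a common formulation in the EEG literature is a least-squares backward model: a linear decoder reconstructs the attended speech envelope from time-lagged EEG, and attention is assigned to whichever candidate speaker's envelope correlates best with the reconstruction. The sketch below illustrates that generic scheme only; all names and parameters (`n_lags`, `reg`, the envelope inputs) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lagged_eeg(eeg, n_lags):
    """Stack time-lagged copies of the EEG (T x C) into a design matrix
    (T x C*n_lags), so the decoder integrates a short temporal context."""
    T, C = eeg.shape
    X = np.zeros((T, C * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * C:(lag + 1) * C] = eeg[:T - lag]
    return X

def train_decoder(eeg, attended_env, n_lags=32, reg=1e3):
    """Ridge-regularized least-squares decoder mapping lagged EEG to the
    attended speech envelope (standard backward-model AAD training)."""
    X = lagged_eeg(eeg, n_lags)
    XtX = X.T @ X + reg * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ attended_env)

def decode_attention(decoder, eeg, env_a, env_b, n_lags=32):
    """Correlate the reconstructed envelope with each candidate speaker's
    envelope and pick the better match (0 -> speaker A, 1 -> speaker B)."""
    rec = lagged_eeg(eeg, n_lags) @ decoder
    r_a = np.corrcoef(rec, env_a)[0, 1]
    r_b = np.corrcoef(rec, env_b)[0, 1]
    return 0 if r_a >= r_b else 1
```

In a two-speaker scenario, the decoded index would then select which speaker's mask and steering information drives the downstream beamformer.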
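The WPD convolutional beamformer named in the abstract unifies dereverberation and beamforming: per frequency bin, it filters the current STFT frame stacked with delayed frames, minimizing the target-power-weighted output power subject to a distortionless constraint on the current frame. The following is a minimal single-bin sketch under a common formulation of WPD; the paper's exact variant (including the proposed interference-suppression constraint) may differ, and all names and defaults are illustrative.

```python
import numpy as np

def wpd_beamformer(Y, power, steering, delay=3, taps=5, eps=1e-8):
    """Single-frequency-bin WPD convolutional beamformer sketch.

    Y        : (M, T) complex STFT frames of M microphones at one bin
    power    : (T,) estimated target power lambda_t (e.g. from a NN mask)
    steering : (M,) steering vector / relative transfer function of target
    Returns  : (T,) enhanced STFT frames
    """
    M, T = Y.shape
    N = M * (taps + 1)
    # Stack the current frame with `taps` delayed frames; the prediction
    # delay keeps early reflections of the target intact.
    Ybar = np.zeros((N, T), dtype=complex)
    Ybar[:M] = Y
    for k in range(taps):
        d = delay + k
        if d >= T:
            break
        Ybar[M * (k + 1):M * (k + 2), d:] = Y[:, :T - d]
    # Power-weighted spatio-temporal covariance R = E[ybar ybar^H / lambda].
    w_t = 1.0 / np.maximum(power, eps)
    R = (Ybar * w_t) @ Ybar.conj().T / T
    # Distortionless constraint acts only on the current frame, so the
    # steering vector is zero-padded over the delayed taps.
    vbar = np.zeros(N, dtype=complex)
    vbar[:M] = steering
    Rinv_v = np.linalg.solve(R + eps * np.eye(N), vbar)
    w = Rinv_v / (vbar.conj() @ Rinv_v)
    return w.conj() @ Ybar
```

The interference-suppression extension mentioned in the abstract would add a further constraint on the response toward the interfering speaker; the sketch implements only the basic distortionless form, with the time-varying power estimate expected to come from the neural-network mask estimator, gated by the AAD decision.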

Bibliographic details

Main authors: Aroudi, Ali; Delcroix, Marc; Nakatani, Tomohiro; Kinoshita, Keisuke; Araki, Shoko; Doclo, Simon
Format: Article
Language: English
Subjects: Computer Science - Sound
Online access: Order full text
DOI: 10.48550/arxiv.2005.04669
Published: 2020-05-10
Full text: https://arxiv.org/abs/2005.04669
Rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0 (open access)
Record ID: cdi_arxiv_primary_2005_04669
Source: arXiv.org
Link resolver (SFX): https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T21%3A02%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cognitive-driven%20convolutional%20beamforming%20using%20EEG-based%20auditory%20attention%20decoding&rft.au=Aroudi,%20Ali&rft.date=2020-05-10&rft_id=info:doi/10.48550/arxiv.2005.04669&rft_dat=%3Carxiv_GOX%3E2005_04669%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true