Exploiting alternative acoustic sensors for improved noise robustness in speech communication

• This study is focused on speech communication using body-conducted sensors.
• The sensors are evaluated using subjective tests and automatic speech recognition.
• Improvements are obtained when using fusion of different sensors.
• A fusion method is proposed, which does not require adjustment of weights....

Bibliographic details
Published in: Pattern recognition letters 2018-09, Vol.112, p.191-197
Main authors: Heracleous, Panikos; Even, Jani; Sugaya, Fumiaki; Hashimoto, Masayuki; Yoneyama, Akio
Format: Article
Language: eng
Online access: Full text
container_end_page 197
container_issue
container_start_page 191
container_title Pattern recognition letters
container_volume 112
creator Heracleous, Panikos
Even, Jani
Sugaya, Fumiaki
Hashimoto, Masayuki
Yoneyama, Akio
description • This study is focused on speech communication using body-conducted sensors.
• The sensors are evaluated using subjective tests and automatic speech recognition.
• Improvements are obtained when using fusion of different sensors.
• A fusion method is proposed, which does not require adjustment of weights.
• A method is also suggested for segmenting noisy speech data.
This study investigates the use of non-conventional body-conductive acoustic sensors in human-human speech communication and automatic speech recognition. The body-conductive sensors are directly attached to the speaker and receive the uttered speech through the skin and bones, resulting in higher robustness against environmental noise. In this study, a throat microphone, an ear bone microphone, and a standard microphone were evaluated using subjective speech intelligibility tests and automatic speech recognition experiments. In addition to the use of these sensors on their own, several methods were also applied for sensor integration, thereby achieving higher recognition rates. Namely, multi-stream hidden Markov model (HMM) decision fusion, and late fusion methods were used to integrate several sensors. By using late fusion, a 40% relative recognition rate improvement in a noisy environment, and a 24% relative recognition rate improvement in a clean environment were achieved. In the case of late fusion, a novel adaptive weighting method was introduced that does not require any pre-adjustment of the weights. In this study, a technique to automatically segment noisy speech data by using a body-conductive sensor in conjunction with the desired microphone during recording is presented. The Lombard effect phenomenon when using body-conductive acoustic sensors was also investigated.
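The late fusion idea mentioned in the description can be illustrated with a minimal Python sketch. This is not the authors' method: the record does not say how their adaptive weights are computed, so the reliability measure used here (the margin between the two best class posteriors of each sensor) is an assumption, and the names `softmax`, `confidence`, `late_fusion`, `standard_mic`, and `throat_mic` are hypothetical. The sketch only shows the general shape of fusing per-sensor recognizer scores with weights derived from the scores themselves rather than tuned in advance.

```python
import math


def softmax(scores):
    """Convert raw per-class scores (e.g. log-likelihoods) into posteriors."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(posteriors):
    """Assumed reliability measure: margin between the two best posteriors."""
    top = sorted(posteriors, reverse=True)
    return top[0] - top[1]


def late_fusion(stream_scores):
    """Fuse per-sensor class scores without pre-adjusted weights.

    stream_scores maps a sensor name to its list of per-class scores.
    Returns (index of the winning class, weight used for each sensor).
    """
    posteriors = {name: softmax(s) for name, s in stream_scores.items()}
    conf = {name: confidence(p) for name, p in posteriors.items()}
    total = sum(conf.values())
    n = len(conf)
    # Fall back to equal weights when no sensor prefers any class.
    weights = {
        name: (c / total if total > 0 else 1.0 / n) for name, c in conf.items()
    }
    n_classes = len(next(iter(posteriors.values())))
    fused = [
        sum(weights[name] * posteriors[name][k] for name in posteriors)
        for k in range(n_classes)
    ]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, weights


# Illustrative input: a nearly undecided standard microphone and a
# confident throat microphone (made-up scores, not data from the study).
best, weights = late_fusion(
    {"standard_mic": [0.2, 0.0, 0.0], "throat_mic": [0.0, 4.0, 0.0]}
)
```

In this toy input the confident sensor receives the larger weight, so its preferred class wins the fused decision. Other reliability measures (entropy of the posteriors, an estimated signal-to-noise ratio) would fit the same structure.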
doi_str_mv 10.1016/j.patrec.2018.07.014
format Article
fullrecord
sourceid proquest_cross
recordid TN_cdi_proquest_journals_2119938864
sourcerecordid 2119938864
els_id S0167865518303040
sourcetype Aggregation Database
recordtype article
publisher Amsterdam: Elsevier B.V
rights 2018 Elsevier B.V.
Copyright Elsevier Science Ltd. Sep 1, 2018
toplevel peer_reviewed
online_resources
orcid https://orcid.org/0000-0002-7709-8195
linktohtml https://www.sciencedirect.com/science/article/pii/S0167865518303040
creationdate 20180901
eissn 1872-7344
tpages 7
delcategory Remote Search Resource
collection CrossRef
Computer and Information Systems Abstracts
Linguistics and Language Behavior Abstracts (LLBA)
Neurosciences Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
fulltext fulltext
identifier ISSN: 0167-8655
ispartof Pattern recognition letters, 2018-09, Vol.112, p.191-197
issn 0167-8655
1872-7344
language eng
recordid cdi_proquest_journals_2119938864
source Elsevier ScienceDirect Journals
subjects Acoustic noise
Acoustics
Auditory system
Automatic speech recognition
Background noise
Body-conducted sensors
Bones
Communication
Conductivity
Experiments
Fusion
Hidden Markov models (HMMs)
Intelligibility
Markov analysis
Markov chains
Noise
Noise robustness
Pharynx
Recording
Robustness
Sensors
Skin
Speech
Speech intelligibility
Speech recognition
Speech tests
Voice recognition
title Exploiting alternative acoustic sensors for improved noise robustness in speech communication
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T07%3A15%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploiting%20alternative%20acoustic%20sensors%20for%20improved%20noise%20robustness%20in%20speech%20communication&rft.jtitle=Pattern%20recognition%20letters&rft.au=Heracleous,%20Panikos&rft.date=2018-09-01&rft.volume=112&rft.spage=191&rft.epage=197&rft.pages=191-197&rft.issn=0167-8655&rft.eissn=1872-7344&rft_id=info:doi/10.1016/j.patrec.2018.07.014&rft_dat=%3Cproquest_cross%3E2119938864%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2119938864&rft_id=info:pmid/&rft_els_id=S0167865518303040&rfr_iscdi=true