Exploiting alternative acoustic sensors for improved noise robustness in speech communication

Bibliographic Details
Published in:	Pattern recognition letters 2018-09, Vol.112, p.191-197
Main Authors:	Heracleous, Panikos; Even, Jani; Sugaya, Fumiaki; Hashimoto, Masayuki; Yoneyama, Akio
Format:	Article
Language:	eng
Subjects:	Acoustic noise; Acoustics; Auditory system; Automatic speech recognition; Background noise; Body-conducted sensors; Bones; Communication; Conductivity; Experiments; Fusion; Hidden Markov models (HMMs); Intelligibility; Markov analysis; Markov chains; Noise; Noise robustness; Pharynx; Recording; Robustness; Sensors; Skin; Speech; Speech intelligibility; Speech recognition; Speech tests; Voice recognition
Online Access:	Full text

Description
Summary:
• This study focuses on speech communication using body-conducted sensors.
• The sensors are evaluated using subjective tests and automatic speech recognition.
• Improvements are obtained when using fusion of different sensors.
• A fusion method is proposed that does not require adjustment of weights.
• A method is also suggested for segmenting noisy speech data.

This study investigates the use of non-conventional body-conductive acoustic sensors in human-human speech communication and automatic speech recognition. The body-conductive sensors are directly attached to the speaker and receive the uttered speech through the skin and bones, resulting in higher robustness against environmental noise. In this study, a throat microphone, an ear bone microphone, and a standard microphone were evaluated using subjective speech intelligibility tests and automatic speech recognition experiments. In addition to the use of these sensors on their own, several methods were applied for sensor integration, thereby achieving higher recognition rates. Specifically, multi-stream hidden Markov model (HMM) decision fusion and late fusion were used to integrate several sensors. Late fusion achieved a 40% relative improvement in recognition rate in a noisy environment and a 24% relative improvement in a clean environment. For late fusion, a novel adaptive weighting method is introduced that does not require any pre-adjustment of the weights. A technique is also presented for automatically segmenting noisy speech data by using a body-conductive sensor in conjunction with the desired microphone during recording. The Lombard effect when using body-conductive acoustic sensors was also investigated.
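
The summary does not specify how the adaptive late-fusion weights are computed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each sensor's recognizer returns one log-likelihood score per candidate class, and it derives each sensor's weight from the margin between its best and second-best posterior, so no weight has to be tuned in advance. The function names (`softmax`, `confidence`, `late_fusion`), the margin-based weighting, and the sample scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn per-class log-likelihood scores into posteriors that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(posteriors):
    """Assumed confidence measure: gap between the two largest posteriors."""
    top = sorted(posteriors, reverse=True)
    return top[0] - top[1]


def late_fusion(sensor_scores):
    """Fuse per-sensor class scores with weights computed from the scores.

    sensor_scores maps a sensor name to a list of per-class log-likelihoods.
    Returns (index of the winning class, weights per sensor, fused posteriors).
    """
    posteriors = {name: softmax(s) for name, s in sensor_scores.items()}
    conf = {name: confidence(p) for name, p in posteriors.items()}
    total = sum(conf.values())
    if total > 0:
        weights = {name: c / total for name, c in conf.items()}
    else:
        # All sensors equally unsure: fall back to uniform weights.
        weights = {name: 1.0 / len(posteriors) for name in posteriors}
    num_classes = len(next(iter(posteriors.values())))
    fused = [
        sum(weights[name] * posteriors[name][k] for name in posteriors)
        for k in range(num_classes)
    ]
    best = max(range(num_classes), key=fused.__getitem__)
    return best, weights, fused


# Hypothetical scores for three candidate words in a noisy environment.
scores = {
    "throat": [-10.0, -12.0, -30.0],
    "ear_bone": [-11.0, -10.5, -25.0],
    "standard": [-20.0, -20.1, -20.2],
}
best, weights, fused = late_fusion(scores)
```

In this made-up example the standard microphone produces nearly flat scores, so its weight is small and the decision is dominated by the two body-conducted sensors.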
ISSN:	0167-8655, 1872-7344
DOI:	10.1016/j.patrec.2018.07.014
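
The segmentation idea mentioned in the summary (using a body-conductive sensor recorded together with the desired microphone) can be illustrated with a minimal sketch. The record gives no algorithmic details, so everything here is an assumption: the two channels are taken to be sample-synchronized, speech is detected with a simple frame-energy threshold on the body-conducted channel, and the resulting boundaries are then applied to the noisy microphone channel. The names `frame_energies`, `speech_segments`, and `cut`, as well as the frame length and threshold, are hypothetical.

```python
def frame_energies(signal, frame_len):
    """Mean energy of each non-overlapping frame of frame_len samples."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def speech_segments(body_signal, frame_len, threshold):
    """Find speech regions on the noise-robust body-conducted channel.

    Returns a list of (start_sample, end_sample) pairs, end exclusive.
    """
    energies = frame_energies(body_signal, frame_len)
    segments = []
    start = None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx * frame_len           # speech onset
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # speech offset
            start = None
    if start is not None:
        # Speech ran until the last complete frame.
        segments.append((start, len(energies) * frame_len))
    return segments


def cut(mic_signal, segments):
    """Apply the boundaries to the synchronized (noisy) microphone channel."""
    return [mic_signal[s:e] for s, e in segments]


# Toy data: two bursts of "speech" on the body channel.
body = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [1.0] * 4
mic = list(range(16))  # stand-in for the noisy microphone samples
segments = speech_segments(body, frame_len=4, threshold=0.5)
pieces = cut(mic, segments)
```

Because the boundaries come from the body-conducted channel, background noise in the microphone channel does not affect where the cuts fall; a practical system would likely need smoothing and a data-driven threshold.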