An evaluation of the automatically topology-generated auto-regressive hidden Markov model with regard to an esophageal voice enhancement task


Bibliographic details
Published in: The Journal of the Acoustical Society of America, 2016-10, Vol. 140 (4), p. 2964
Main author: Sasou, Akira
Format: Article
Language: English
Description
Summary: An Auto-Regressive eXogenous (ARX) model combined with descriptive models of the glottal source waveform has been adopted to more accurately separate the vocal tract and the voicing source. However, these methods cannot be easily applied to the analysis of voices uttered by different speech production methods, such as esophageal voice. We previously proposed the voicing source hidden Markov model (HMM) and an accompanying parameter estimation method. We refer to the model combining the HMM with an Auto-Regressive (AR) filter as AR-HMM. The proposed method automatically generates the optimum topology for the HMM using the minimum description length-based successive state splitting algorithm in order to simultaneously and accurately estimate the vocal tract and voicing source based on a voice excited by an unknown, aperiodic voicing source such as an esophageal voice. In this paper, we evaluate the perceived quality of the enhanced esophageal voices, which are synthesized by filtering the voicing source extracted from normal speakers with the AR filter extracted from the esophageal voices.
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.4969164
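
The synthesis step described in the summary (a voicing source taken from a normal speaker, passed through an AR filter taken from an esophageal voice) can be illustrated with a minimal sketch. This is not the author's implementation: it assumes the AR coefficients are already known, uses plain inverse filtering in place of the AR-HMM estimation and the MDL-based successive state splitting described in the abstract, and assumes the sign convention y[n] = e[n] - sum_k a[k] * y[n-1-k]. The function names `ar_synthesize`, `ar_inverse_filter`, and `cross_synthesize` are hypothetical.

```python
def ar_synthesize(excitation, ar_coeffs):
    """All-pole (AR) synthesis filter.

    Assumed convention: y[n] = e[n] - sum_k a[k] * y[n-1-k],
    where ar_coeffs = [a_1, ..., a_p].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(ar_coeffs):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def ar_inverse_filter(signal, ar_coeffs):
    """Recover the excitation (residual) under the same convention:
    e[n] = y[n] + sum_k a[k] * y[n-1-k].
    """
    res = []
    for n, y in enumerate(signal):
        acc = y
        for k, a in enumerate(ar_coeffs):
            if n - 1 - k >= 0:
                acc += a * signal[n - 1 - k]
        res.append(acc)
    return res


def cross_synthesize(source_voice, source_ar, target_ar):
    """Extract the excitation of one voice and re-filter it with the
    AR (vocal tract) coefficients of another voice.

    In the setting of the abstract, source_voice would be the normal
    speaker and target_ar the filter estimated from the esophageal
    voice; both sets of coefficients are assumed to be given here.
    """
    excitation = ar_inverse_filter(source_voice, source_ar)
    return ar_synthesize(excitation, target_ar)


if __name__ == "__main__":
    # Toy first-order example: an impulse through a one-pole filter.
    voice = ar_synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])
    print(cross_synthesize(voice, [-0.5], [0.5]))
```

In practice the coefficients would be estimated frame by frame from recorded speech, and the filter would be updated per frame; the sketch uses a single fixed filter only to keep the source-filter idea visible.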