Glottal Activity Detection from the Speech Signal Using Multifractal Analysis
Published in: | Circuits, Systems, and Signal Processing, 2020-04, Vol.39 (4), p.2118-2150 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | This work proposes a novel method for the detection of glottal activity regions from the speech signal. Glottal activity detection refers to the problem of discriminating between voiced and unvoiced segments of the speech signal. This is a fundamental step in the workflow of many speech processing applications. Many of the existing approaches to voiced/unvoiced detection are based on linear measures, even though speech is produced by an underlying nonlinear process. The present work addresses the problem from a nonlinear perspective, using the framework of multifractal analysis. The fractal properties of the speech signal during the production of voiced and unvoiced sounds are examined to characterize glottal activity. The characterization is done by computing the Hurst exponent from the scaling behavior of the fluctuations present in the speech signal. Experimental analysis shows that the Hurst exponent varies consistently with the dynamics of glottal activity. The performance of the proposed method has been evaluated on the CMU-arctic, Keele and KED-Timit databases with simultaneous electroglottogram signals. Experimental results show that the average detection accuracy (or error rate) of the proposed method is comparable to that of the best-performing algorithm on clean speech signals. In addition, an evaluation of the robustness of the proposed method to noise degradation shows results comparable with other methods for signal-to-noise ratios greater than 10 dB for white noise and 20 dB for babble noise. |
---|---|
ISSN: | 0278-081X 1531-5878 |
DOI: | 10.1007/s00034-019-01253-4 |
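
The summary states that the Hurst exponent is obtained from the scaling of fluctuations in the signal, but the record does not give the procedure. As a rough illustration only, the sketch below uses ordinary (monofractal) detrended fluctuation analysis, which is a swapped-in stand-in and not necessarily the authors' multifractal method. The function name `hurst_dfa`, the window sizes in `scales`, and the detrending `order` are all assumptions made for this example.

```python
import numpy as np


def hurst_dfa(x, scales=(16, 32, 64, 128, 256), order=1):
    """Estimate a scaling exponent with plain detrended fluctuation analysis.

    Assumption: this is generic DFA, not the method of the catalogued paper.
    The default scales and detrending order are illustrative choices.
    """
    x = np.asarray(x, dtype=float)
    # Profile: cumulative sum of the mean-removed signal.
    profile = np.cumsum(x - x.mean())

    log_s, log_f = [], []
    for s in scales:
        n_win = len(profile) // s
        if n_win < 2:
            continue  # too few windows at this scale to be meaningful
        # Split the profile into non-overlapping windows of length s.
        segs = profile[: n_win * s].reshape(n_win, s).T  # shape (s, n_win)
        t = np.arange(s)
        # Fit one polynomial trend per window (columns are fitted independently).
        coeffs = np.polyfit(t, segs, order)              # shape (order+1, n_win)
        trend = np.vander(t, order + 1) @ coeffs         # shape (s, n_win)
        # Root-mean-square fluctuation around the local trends.
        f = np.sqrt(np.mean((segs - trend) ** 2))
        log_s.append(np.log(s))
        log_f.append(np.log(f))

    if len(log_s) < 2:
        raise ValueError("signal too short for the requested scales")

    # The exponent is the slope of log F(s) against log s.
    slope, _ = np.polyfit(log_s, log_f, 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.standard_normal(8192)
    print("white noise :", hurst_dfa(white))
    print("random walk :", hurst_dfa(np.cumsum(white)))
```

For uncorrelated noise this estimator is expected to land near 0.5, and for an integrated (random-walk) signal near 1.5. A voiced/unvoiced detector along the lines described in the summary would presumably apply such an estimate to short, sliding frames of speech and threshold the result; the frame length and threshold are not given in this record.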