Enabling improved speaker recognition by voice quality estimation
Presented is a method to mitigate noise and interference in automated speaker identification (SID). This process uses the MIT/LL SID module without modifications. In this process, speaker models are built for a lattice of signal to noise ratio (SNR) levels. The SNR of the received signal is estimate...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Tagungsbericht |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Presented is a method to mitigate noise and interference in automated speaker identification (SID). This process uses the MIT/LL SID module without modifications. In this process, speaker models are built for a lattice of signal to noise ratio (SNR) levels. The SNR of the received signal is estimated by first applying speech activity detection to identify portions of the signal that actually contain speech. A voice quality estimation process is then applied to estimate the SNR of the received signal. The speaker models representing the SNR of the received signal are dynamically loaded, and conventional SID is applied. In training, the SNR of each training signal is estimated, and the signal is modified by adding noise to create a signal at the desired SNR. Using this process, each signal may be used to train models at any SNR level less than or equal to the SNR of the original signal. The process has been fully implemented and is completely automated. |
---|---|
ISSN: | 1058-6393 2576-2303 |
DOI: | 10.1109/ACSSC.2011.6190071 |