Enhancing Biometric Speaker Recognition Through MFCC Feature Extraction and Polar Codes for Remote Application

Bibliographic details
Published in:	IEEE Access, 2023, Vol. 11, pp. 133921-133930
Main authors:	Wankhede, Nilashree; Wagh, Sushama
Format:	Article
Language:	English
Subjects:	Accuracy; Artificial neural networks; Authentication; Biometric; Biometric recognition systems; Biometrics; Bit error rate; Channel noise; Codes; Data integrity; Decoding; Error correction; Feature extraction; Fingerprint recognition; Fingerprint verification; Integrity; Mel frequency cepstral coefficient; Polar codes; Recognition rate; Speaker recognition; Speech recognition; System effectiveness
Online access:	Full text

Description
Abstract:	Extensive research has been conducted in biometrics, particularly in face and fingerprint recognition, but remote speaker recognition has yet to gain global acceptance because of challenges with accuracy and data integrity. Previous studies in speaker recognition have explored techniques such as Mel Frequency Cepstral Coefficients (MFCC) and Convolutional Neural Networks (CNN), reporting accuracy rates of 90.4% and 92.8%, respectively, on a small, fixed database with a standalone system. This paper proposes a novel approach to address the data integrity and accuracy issues in remote speaker recognition. Remote speaker recognition was first implemented in a client-server setup, but channel noise prevented any noticeable improvement in accuracy over existing methods. The new approach extracts MFCC parameters from voice samples and then applies polar error-correcting codes for both storage and transmission to preserve fidelity. With a code rate of 1/2 and a block length of 1024 bits, transmitting polar-coded MFCC features over a noisy channel yielded a lower bit error rate when coupled with successive cancellation list decoding. Simulation results demonstrate a reduction in bit error rate, resulting in an accuracy of 95.2% in the implemented remote speaker recognition system, an improvement of about 5 percentage points (95.2% versus 90.4%) over the existing standalone system that uses uncoded MFCC features. These findings indicate that polar codes can be used effectively in speaker recognition systems to enhance their robustness and reliability, especially over noisy channels or in other challenging conditions.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2023.3333039
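
Illustrative sketch (not part of the catalog record or the paper): the abstract describes protecting MFCC features with a rate-1/2 polar code of block length 1024. The Python sketch below shows what one such block could look like end to end. Everything beyond the rate and block length is an assumption made for illustration: the float32 bit packing of the features, the frozen-bit selection by a Bhattacharyya heuristic for an erasure channel, BPSK over an AWGN channel, and a plain successive cancellation (SC) decoder, which is simpler than the list decoder the abstract names. All function names are hypothetical.

```python
import numpy as np

N, K = 1024, 512  # block length and info bits: rate 1/2, as stated in the abstract

def frozen_mask(n, k, z0=0.5):
    """True where a bit-channel is frozen; BEC(z0) Bhattacharyya heuristic (assumed)."""
    z = np.array([z0])
    while z.size < n:
        # Each channel splits into a worse (2z - z^2) and a better (z^2) one.
        z = np.stack([2 * z - z**2, z**2], axis=1).reshape(-1)
    mask = np.ones(n, dtype=bool)
    mask[np.argsort(z, kind="stable")[:k]] = False  # k most reliable carry data
    return mask

def polar_encode(u):
    """x = u * F^(kron n) over GF(2), natural bit order (no bit reversal)."""
    x = u.copy()
    step = 1
    while step < x.size:
        for i in range(0, x.size, 2 * step):
            x[i:i + step] ^= x[i + step:i + 2 * step]
        step *= 2
    return x

def sc_decode(llr, frozen):
    """Recursive SC decoder; returns (u_hat, re-encoded codeword estimate)."""
    if llr.size == 1:
        bit = 0 if (frozen[0] or llr[0] >= 0) else 1
        leaf = np.array([bit], dtype=np.int64)
        return leaf, leaf
    half = llr.size // 2
    a, b = llr[:half], llr[half:]
    # Min-sum "f" update for the first half of the message.
    left = np.sign(a) * np.sign(b) * np.minimum(np.abs(a), np.abs(b))
    u1, x1 = sc_decode(left, frozen[:half])
    # "g" update for the second half, using the decisions already made.
    u2, x2 = sc_decode(b + (1 - 2 * x1) * a, frozen[half:])
    return np.concatenate([u1, u2]), np.concatenate([x1 ^ x2, x2])

def features_to_bits(features):
    """Assumed packing: raw float32 bytes, 32 bits per coefficient."""
    return np.unpackbits(features.astype(np.float32).view(np.uint8)).astype(np.int64)

def bits_to_features(bits):
    """Inverse of features_to_bits."""
    return np.packbits(bits.astype(np.uint8)).view(np.float32)

def roundtrip(bits, frozen, sigma, rng):
    """Encode K bits, send over BPSK/AWGN with noise std sigma, decode with SC."""
    u = np.zeros(frozen.size, dtype=np.int64)
    u[~frozen] = bits  # frozen positions stay 0
    y = (1 - 2 * polar_encode(u)) + sigma * rng.normal(size=u.size)
    u_hat, _ = sc_decode(2 * y / sigma**2, frozen)
    return u_hat[~frozen]
```

Under this packing, one 1024-bit block carries 16 float32 coefficients (16 x 32 = 512 information bits). A call such as `roundtrip(features_to_bits(feats), frozen_mask(N, K), sigma, np.random.default_rng(0))` followed by `bits_to_features` returns the received feature vector; sweeping `sigma` and counting bit errors would give a rough bit error rate curve for this simplified decoder.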