Group Attack Dingo Optimizer for enhancing speech recognition in noisy environments
The speech recognition system has become a vital technology enabling seamless human–computer interactions, even in noisy public places. To enhance the performance of various applications like machine translation, natural language processing, spoken language understanding, and text generation, speech...
Gespeichert in:
Veröffentlicht in: | European physical journal plus 2023-12, Vol.138 (12), p.1145, Article 1145 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The speech recognition system has become a vital technology enabling seamless human–computer interactions, even in noisy public places. To enhance the performance of various applications like machine translation, natural language processing, spoken language understanding, and text generation, speech enhancement (SE) techniques play a crucial role. In this study, we introduce a novel approach termed (GA-DOA) for optimizing speech enhancement tasks. Our method combines an improved short-time Fourier transform (STFT) and an optimized deep U-Net, with GA-DOA used to fine-tune the parameters. Additionally, feature extraction employs Mel-frequency cepstral coefficients (MFCCs), spectral features, and one-dimensional convolutional neural networks (1D-CNN). To select the most effective features, we employ GA-DOA-assisted feature selection. These optimized features are then fed into our proposed hybrid model for speech recognition (HMSR), which integrates bidirectional long short-term memory (BiLSTM) with the gated recurrent unit (GRU). Experimental results reveal that our proposed model achieves superior recognition rates and significantly lowers the word error rate (WER), thereby demonstrating enhanced system performance, even in noisy environments. |
---|---|
ISSN: | 2190-5444 2190-5444 |
DOI: | 10.1140/epjp/s13360-023-04775-8 |