Parallel implementation of artificial neural network training

Bibliographic Details
Main authors:	Scanzio, Stefano; Cumani, Sandro; Gemello, Roberto; Mana, Franco; Laface, P
Format:	Conference proceeding
Language:	English
Subjects:	Acoustic testing; Artificial Neural Network; Artificial neural networks; CUDA; Fast Training; Feedforward systems; Focused Attention Back-Propagation; GPU; Hidden Markov models; Libraries; Matrix converters; Multicore processing; Speech recognition; State estimation; Vocabulary

Description
Summary:	In this paper we describe the implementation of a complete ANN training procedure for speech recognition using the block-mode back-propagation learning algorithm. We exploit the high-performance SIMD architecture of the GPU using CUDA and its C-like language interface. We also compare the speed-up obtained by implementing the training procedure using only the multi-thread capabilities of multi-core processors. Our approach has been tested by training acoustic models for large-vocabulary speech recognition tasks, showing a sixfold reduction in the time required to train large, real-world networks with respect to an already optimized implementation using the Intel MKL libraries.
ISSN:	1520-6149, 2379-190X
DOI:	10.1109/ICASSP.2010.5495108
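
Note on the summary: block-mode back-propagation, as the term is commonly used, updates the weights once per block of training patterns rather than once per pattern, so the forward and backward passes become matrix-matrix products that a GPU or a BLAS library can accelerate. The following is a minimal sketch of that idea only, not the authors' implementation, which this record does not show. The network shape, the squared-error loss, the learning rate, and all names (`matmul`, `train_block`, `block_x`, `block_t`) are assumptions made for illustration, and plain Python stands in for the CUDA and Intel MKL routines.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product (stand-in for a GPU or BLAS routine)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_block(w1, w2, block_x, block_t, lr):
    """One block-mode update for a hypothetical one-hidden-layer network.

    All patterns of the block go through the forward and backward passes
    as matrix products; the weights are updated once at the end.
    Returns the new weights and the loss measured before the update.
    """
    n = len(block_x)
    # Forward pass for the whole block (n x inputs times inputs x hidden, etc.).
    h = [[sigmoid(v) for v in row] for row in matmul(block_x, w1)]
    y = [[sigmoid(v) for v in row] for row in matmul(h, w2)]
    # Mean squared error over the block (assumed loss).
    loss = sum((yi - ti) ** 2
               for yr, tr in zip(y, block_t)
               for yi, ti in zip(yr, tr)) / (2 * n)
    # Output-layer deltas for sigmoid outputs.
    d2 = [[(yi - ti) * yi * (1 - yi) for yi, ti in zip(yr, tr)]
          for yr, tr in zip(y, block_t)]
    # Propagate the deltas back to the hidden layer.
    back = matmul(d2, transpose(w2))
    d1 = [[b * hi * (1 - hi) for b, hi in zip(br, hr)]
          for br, hr in zip(back, h)]
    # Gradients accumulated over the block, again as matrix products.
    g1 = matmul(transpose(block_x), d1)
    g2 = matmul(transpose(h), d2)
    # Single weight update per block.
    w1 = [[w - lr * g / n for w, g in zip(wr, gr)] for wr, gr in zip(w1, g1)]
    w2 = [[w - lr * g / n for w, g in zip(wr, gr)] for wr, gr in zip(w2, g2)]
    return w1, w2, loss


# Toy usage: learn logical OR; the third input column is a constant bias term.
random.seed(0)
block_x = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
block_t = [[0.0], [1.0], [1.0], [1.0]]
w1 = [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(3)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(1)] for _ in range(4)]
losses = []
for _ in range(200):
    w1, w2, loss = train_block(w1, w2, block_x, block_t, lr=0.5)
    losses.append(loss)
```

In a real system each `matmul` call would map to a single matrix-multiply routine on the GPU or in a multi-threaded CPU library, which is where a larger block size can pay off.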