Fast Algorithms for Convolutional Neural Networks
Main authors: | , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Deep convolutional neural networks take GPU-days of compute time to train on
large data sets. Pedestrian detection for self-driving cars requires very low
latency. Image recognition on mobile phones is constrained by limited
processing resources. The success of convolutional neural networks in these
situations is limited by how fast we can compute them. Conventional FFT-based
convolution is fast for large filters, but state-of-the-art convolutional
neural networks use small 3×3 filters. We introduce a new class of fast
algorithms for convolutional neural networks using Winograd's minimal filtering
algorithms. The algorithms compute minimal-complexity convolution over small
tiles, which makes them fast with small filters and small batch sizes. We
benchmark a GPU implementation of our algorithm with the VGG network and show
state-of-the-art throughput at batch sizes from 1 to 64. |
---|---|
DOI: | 10.48550/arxiv.1509.09308 |
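
The summary's central idea, minimal filtering over small tiles, can be illustrated with the smallest one-dimensional case, usually written F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of the 6 a direct evaluation needs. The sketch below is only an illustration in plain Python. The function names `direct_f23` and `winograd_f23` are hypothetical, and it is not the GPU implementation the summary benchmarks.

```python
def direct_f23(d, g):
    """Direct evaluation: 2 outputs of a 3-tap filter, 6 multiplications."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def winograd_f23(d, g):
    """Winograd minimal filtering F(2,3): same outputs, 4 multiplications.

    d: 4 input samples, g: 3 filter taps.
    The filter-side sums (and the division by 2) depend only on g, so they
    can be precomputed once and reused across every input tile.
    """
    # Filter transform (reusable across tiles).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the 4 element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    taps = [1.0, 0.0, -1.0]
    print(direct_f23(data, taps))
    print(winograd_f23(data, taps))
```

Nesting the one-dimensional transform with itself gives the two-dimensional tile version that applies to 3×3 filters. The saving in multiplications grows with tile size, at the cost of more additions and larger transform constants.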