Fast Algorithms for Convolutional Neural Networks

Deep convolutional neural networks take GPU-days of compute time to train on large data sets. Pedestrian detection for self-driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. The success of convolutional neural networks in thes...

Detailed description

Bibliographic details
Published in:	arXiv.org 2015-11
Main authors:	Lavin, Andrew; Gray, Scott
Format:	Article
Language:	eng
Subjects:	Algorithms; Artificial neural networks; Autonomous cars; Batch processing; Convolution; Graphics processing units; Image detection; Neural networks; Object recognition; Railroad cars
Online access:	Full text

container_title	arXiv.org
creator	Lavin, Andrew; Gray, Scott
description	Deep convolutional neural networks take GPU-days of compute time to train on large data sets. Pedestrian detection for self-driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. The success of convolutional neural networks in these situations is limited by how fast we can compute them. Conventional FFT-based convolution is fast for large filters, but state-of-the-art convolutional neural networks use small 3x3 filters. We introduce a new class of fast algorithms for convolutional neural networks using Winograd's minimal filtering algorithms. The algorithms compute minimal-complexity convolution over small tiles, which makes them fast with small filters and small batch sizes. We benchmark a GPU implementation of our algorithm with the VGG network and show state-of-the-art throughput at batch sizes from 1 to 64.
format	Article
publisher	Ithaca: Cornell University Library, arXiv.org
creationdate	2015-11-10
rights	2015. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2015-11
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2084028757
source	Free E- Journals
subjects	Algorithms; Artificial neural networks; Autonomous cars; Batch processing; Convolution; Graphics processing units; Image detection; Neural networks; Object recognition; Railroad cars
title	Fast Algorithms for Convolutional Neural Networks
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T01%3A17%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Fast%20Algorithms%20for%20Convolutional%20Neural%20Networks&rft.jtitle=arXiv.org&rft.au=Lavin,%20Andrew&rft.date=2015-11-10&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2084028757%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2084028757&rft_id=info:pmid/&rfr_iscdi=true
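
The description above refers to Winograd's minimal filtering algorithms computing convolution over small tiles. As a rough illustration only, the sketch below shows the smallest standard 1D case, usually written F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of the 6 a direct computation needs. This record does not give the formulas; they are the textbook F(2,3) identities, and the function names `direct_f23` and `winograd_f23` are hypothetical. It is not the authors' GPU implementation, which works on 2D tiles.

```python
def direct_f23(d, g):
    """Direct 3-tap correlation over a 4-element tile: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): same two outputs with 4 multiplications.

    d: input tile of 4 values, g: filter of 3 taps.
    Illustrative sketch only (hypothetical helper, not the paper's code).
    """
    # Filter transform: depends only on g, so it can be computed once
    # and reused for every tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Data transform followed by the 4 element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]
    taps = [1.0, 1.0, 1.0]
    print(direct_f23(tile, taps))
    print(winograd_f23(tile, taps))
```

The saving comes from moving work out of the multiplications and into cheap additions plus a one-time filter transform; the 2D variants nest this same idea over both tile dimensions.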