Sideways: Depth-Parallel Training of Video Models
We propose Sideways, an approximate backpropagation scheme for training video models. In standard backpropagation, the gradients and activations at every computation step through the model are temporally synchronized. The forward activations need to be stored until the backward pass is executed, preventing inter-layer (depth) parallelization. However, can we leverage smooth, redundant input streams such as videos to develop a more efficient training scheme? Here, we explore an alternative to backpropagation; we overwrite network activations whenever new ones, i.e., from new frames, become available. Such a more gradual accumulation of information from both passes breaks the precise correspondence between gradients and activations, leading to theoretically more noisy weight updates. Counter-intuitively, we show that Sideways training of deep convolutional video networks not only still converges, but can also potentially exhibit better generalization compared to standard synchronized backpropagation.
Saved in:
Main authors: | Malinowski, Mateusz; Swirszcz, Grzegorz; Carreira, Joao; Patraucean, Viorica |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Statistics - Machine Learning |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Malinowski, Mateusz; Swirszcz, Grzegorz; Carreira, Joao; Patraucean, Viorica |
description | We propose Sideways, an approximate backpropagation scheme for training video
models. In standard backpropagation, the gradients and activations at every
computation step through the model are temporally synchronized. The forward
activations need to be stored until the backward pass is executed, preventing
inter-layer (depth) parallelization. However, can we leverage smooth, redundant
input streams such as videos to develop a more efficient training scheme? Here,
we explore an alternative to backpropagation; we overwrite network activations
whenever new ones, i.e., from new frames, become available. Such a more gradual
accumulation of information from both passes breaks the precise correspondence
between gradients and activations, leading to theoretically more noisy weight
updates. Counter-intuitively, we show that Sideways training of deep
convolutional video networks not only still converges, but can also potentially
exhibit better generalization compared to standard synchronized
backpropagation. |
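The overwrite-based scheme in the abstract can be illustrated with a toy sketch. This is a minimal single-process simulation of the key approximation, not the paper's pipelined implementation of deep convolutional video networks: every layer caches only its most recent input, so the backward pass for frame t's loss runs against activations already overwritten by the next frame. The layer class, stream, dimensions, and learning rate below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class SidewaysLayer:
    """Linear layer that keeps only its *latest* input activation.

    Standard backprop would store the input from the same frame that
    produced the incoming gradient; here the cache is overwritten as
    soon as a newer frame arrives, so a gradient from frame t may be
    paired with an activation from frame t+1 (the Sideways mismatch).
    """

    def __init__(self, d_in, d_out, lr=0.01):
        self.W = 0.1 * rng.standard_normal((d_in, d_out))
        self.lr = lr
        self.cached_input = None  # one slot, overwritten every frame

    def forward(self, x):
        self.cached_input = x  # overwrite; no per-frame activation store
        return x @ self.W

    def backward(self, grad_out):
        grad_in = grad_out @ self.W.T
        # The update pairs grad_out with the freshest cached input, not
        # the frame that generated it: an approximate, noisier update.
        self.W -= self.lr * self.cached_input.T @ grad_out
        return grad_in

# Smooth, redundant "video" stream: 8 slowly varying sinusoidal features.
phases = rng.uniform(0.0, 2.0 * np.pi, size=(1, 8))
layers = [SidewaysLayer(8, 16), SidewaysLayer(16, 1)]

losses = []
pending_grad = None  # loss gradient carried over from the previous frame

for t in range(500):
    frame = np.sin(0.05 * t + phases)          # frame t of the stream
    target = frame.sum(axis=1, keepdims=True)  # toy regression target

    h = frame
    for layer in layers:          # forward pass overwrites all caches
        h = layer.forward(h)

    if pending_grad is not None:  # desynchronized backward pass:
        g = pending_grad          # gradient from frame t-1 meets
        for layer in reversed(layers):  # activations from frame t
            g = layer.backward(g)

    losses.append(float(((h - target) ** 2).mean()))
    pending_grad = 2.0 * (h - target) / h.size

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because consecutive frames of a smooth stream are nearly identical, the overwritten activations stay close to the ones standard backprop would have stored, which is why training still converges despite the mismatch.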
doi_str_mv | 10.48550/arxiv.2001.06232 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2001.06232 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2001_06232 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Statistics - Machine Learning |
title | Sideways: Depth-Parallel Training of Video Models |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T02%3A23%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sideways:%20Depth-Parallel%20Training%20of%20Video%20Models&rft.au=Malinowski,%20Mateusz&rft.date=2020-01-17&rft_id=info:doi/10.48550/arxiv.2001.06232&rft_dat=%3Carxiv_GOX%3E2001_06232%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |