Memory Efficient Meta-Learning with Large Images
Published at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Meta-learning approaches to few-shot classification are computationally efficient at test time, requiring just a few optimization steps or a single forward pass to learn a new task, but they remain highly memory-intensive to train...
Saved in:
Main Authors: | Bronskill, John; Massiceti, Daniela; Patacchiola, Massimiliano; Hofmann, Katja; Nowozin, Sebastian; Turner, Richard E |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Learning; Statistics - Machine Learning |
Online Access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Bronskill, John; Massiceti, Daniela; Patacchiola, Massimiliano; Hofmann, Katja; Nowozin, Sebastian; Turner, Richard E |
description | Published at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Meta-learning approaches to few-shot classification are computationally efficient at test time, requiring just a few optimization steps or a single forward pass to learn a new task, but they remain highly memory-intensive to train. This limitation arises because a task's entire support set, which can contain up to 1000 images, must be processed before an optimization step can be taken. Harnessing the performance gains offered by large images thus requires either parallelizing the meta-learner across multiple GPUs, which may not be available, or trade-offs between task and image size when memory constraints apply. We improve on both options by proposing LITE, a general and memory-efficient episodic training scheme that enables meta-training on large tasks composed of large images on a single GPU. We achieve this by observing that the gradients for a task can be decomposed into a sum of gradients over the task's training images. This enables us to perform a forward pass on a task's entire training set but realize significant memory savings by back-propagating only a random subset of these images, which we show is an unbiased approximation of the full gradient. We use LITE to train meta-learners and demonstrate new state-of-the-art accuracy on the real-world ORBIT benchmark and on 3 of the 4 parts of the challenging VTAB+MD benchmark relative to leading meta-learners. LITE also enables meta-learners to be competitive with transfer learning approaches but at a fraction of the test-time computational cost, thus serving as a counterpoint to the recent narrative that transfer learning is all you need for few-shot classification. |
doi_str_mv | 10.48550/arxiv.2107.01105 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2107.01105 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2107_01105 |
source | arXiv.org |
subjects | Computer Science - Learning; Statistics - Machine Learning |
title | Memory Efficient Meta-Learning with Large Images |
url | https://arxiv.org/abs/2107.01105 |
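
The abstract above describes the core LITE mechanism: run the forward pass over a task's full support set, but back-propagate through only a random subset of it, rescaled so that the result is an unbiased estimate of the full-task gradient. The snippet below is a minimal PyTorch-style sketch of that idea, not the authors' released code; `encoder`, the prototypical-network head, and all argument names (`support_x`, `subset_size`, ...) are illustrative assumptions.

```python
# Minimal sketch of the LITE idea (illustrative only, not the paper's code):
# forward over the whole support set, back-propagate through a random subset.
import torch
import torch.nn.functional as F


def proto_logits(support_feats, support_y, query_feats, num_classes):
    """ProtoNets-style head: class-mean prototypes, negative squared distances."""
    prototypes = torch.stack(
        [support_feats[support_y == c].mean(dim=0) for c in range(num_classes)]
    )
    return -torch.cdist(query_feats, prototypes) ** 2


def lite_step(encoder, optimizer, support_x, support_y, query_x, query_y,
              num_classes, subset_size):
    n = support_x.shape[0]
    perm = torch.randperm(n)
    bp_idx, ng_idx = perm[:subset_size], perm[subset_size:]

    # Back-propagated subset H: keep the autograd graph for these images.
    feats_bp = encoder(support_x[bp_idx])
    # Remaining support images: forward only, so their activations are freed.
    with torch.no_grad():
        feats_ng = encoder(support_x[ng_idx])

    # Scale the gradient (but not the forward value) by N/|H| so the
    # subsampled gradient is an unbiased estimate of the full-task gradient.
    scale = n / subset_size
    feats_bp = feats_bp.detach() + scale * (feats_bp - feats_bp.detach())

    support_feats = torch.cat([feats_bp, feats_ng], dim=0)
    support_labels = torch.cat([support_y[bp_idx], support_y[ng_idx]], dim=0)

    # In this sketch the (small) query batch is back-propagated in full.
    query_feats = encoder(query_x)
    logits = proto_logits(support_feats, support_labels, query_feats, num_classes)
    loss = F.cross_entropy(logits, query_y)

    optimizer.zero_grad()
    loss.backward()   # gradients flow only through H and the query batch
    optimizer.step()
    return loss.item()
```

Because activations are retained only for the back-propagated subset and the query batch, memory grows with `subset_size` rather than with the full support-set size, which is what makes single-GPU meta-training on large tasks with large images feasible.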