Adaptive Transfer Learning: a simple but effective transfer learning


Bibliographic Details
Main authors: Lee, Jung H; Kvinge, Henry J; Howland, Scott; New, Zachary; Buckheit, John; Phillips, Lauren A; Skomski, Elliott; Hibler, Jessica; Corley, Courtney D; Hodas, Nathan O
Format: Article
Language: eng
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning
Online Access: Order full text
description Transfer learning (TL) leverages previously obtained knowledge to learn new tasks efficiently and has been used to train deep learning (DL) models with limited amount of data. When TL is applied to DL, pretrained (teacher) models are fine-tuned to build domain specific (student) models. This fine-tuning relies on the fact that DL model can be decomposed to classifiers and feature extractors, and a line of studies showed that the same feature extractors can be used to train classifiers on multiple tasks. Furthermore, recent studies proposed multiple algorithms that can fine-tune teacher models' feature extractors to train student models more efficiently. We note that regardless of the fine-tuning of feature extractors, the classifiers of student models are trained with final outputs of feature extractors (i.e., the outputs of penultimate layers). However, a recent study suggested that feature maps in ResNets across layers could be functionally equivalent, raising the possibility that feature maps inside the feature extractors can also be used to train student models' classifiers. Inspired by this study, we tested if feature maps in the hidden layers of the teacher models can be used to improve the student models' accuracy (i.e., TL's efficiency). Specifically, we developed 'adaptive transfer learning (ATL)', which can choose an optimal set of feature maps for TL, and tested it in the few-shot learning setting. Our empirical evaluations suggest that ATL can help DL models learn more efficiently, especially when available examples are limited.
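The description above says ATL chooses an optimal set of feature maps from a teacher model's hidden layers to train a student classifier in the few-shot setting. The toy Python sketch below illustrates that idea only in spirit; it is not the authors' implementation. All layer names and feature vectors are synthetic, and the selection criterion (leave-one-out nearest-centroid accuracy on the support set) is an assumed stand-in for ATL's actual selection procedure.

```python
# Toy illustration (NOT the paper's ATL algorithm): given per-layer
# feature vectors for a few labeled "support" examples, pick the layer
# whose features best separate the classes, measured by leave-one-out
# nearest-centroid accuracy on the support set itself.
from collections import defaultdict

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid_predict(centroids, x):
    # Return the class label whose centroid is closest to x.
    return min(centroids, key=lambda c: sq_dist(centroids[c], x))

def select_layer(support):
    """support: {layer_name: [(feature_vector, label), ...]}.
    Returns the layer with the highest leave-one-out nearest-centroid
    accuracy on the support examples."""
    best_layer, best_acc = None, -1.0
    for layer, examples in support.items():
        correct = 0
        for i, (x, y) in enumerate(examples):
            rest = examples[:i] + examples[i + 1:]
            by_class = defaultdict(list)
            for v, lbl in rest:
                by_class[lbl].append(v)
            cents = {lbl: centroid(vs) for lbl, vs in by_class.items()}
            correct += nearest_centroid_predict(cents, x) == y
        acc = correct / len(examples)
        if acc > best_acc:
            best_layer, best_acc = layer, acc
    return best_layer

# Synthetic 2-way, 2-shot example: the hidden layer "layer3" separates
# the classes cleanly, while the penultimate-layer features overlap.
support = {
    "penultimate": [([0.0, 0.0], "cat"), ([0.0, 1.0], "dog"),
                    ([1.0, 1.0], "cat"), ([1.0, 0.0], "dog")],
    "layer3":      [([0.0, 0.0], "cat"), ([5.0, 5.0], "dog"),
                    ([0.2, 0.1], "cat"), ([5.1, 4.9], "dog")],
}
print(select_layer(support))  # layer3
```

On this synthetic data the hidden layer wins, mirroring the paper's point that feature maps inside the extractor, not just the penultimate output, can carry the signal a few-shot classifier needs.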
doi_str_mv 10.48550/arxiv.2111.10937
date 2021-11-21
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
collection arXiv Computer Science; arXiv.org
fulltext fulltext_linktorsrc
recordid cdi_arxiv_primary_2111_10937
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
Computer Science - Learning