See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition
container_end_page | |
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Si, Chongjie; Yang, Xiaokang; Shen, Wei |
description | The rapid expansion of large foundation models within the pre-training and fine-tuning framework has underscored that larger models often yield better results. However, scaling up large foundation models has led to soaring costs in fine-tuning and parameter storage, rendering extensive adaptations impractical. This challenge has sparked the development of parameter-efficient fine-tuning (PEFT), which focuses on optimizing a select subset of parameters while keeping the rest fixed, significantly lowering computational and storage overheads. While recent years have witnessed significant success in PEFT, a deep understanding of the fundamental principles behind these methods is still lacking. To this end, we take the first step toward unifying these approaches by dissecting them from a decomposition perspective. We conduct a comprehensive mathematical analysis of these methods, which allows us to delve deeply into their underlying mechanisms and to explore why performance varies among different techniques. Furthermore, inspired by our theoretical analysis, we introduce two novel PEFT methods alongside a simple yet effective framework designed to enhance the performance of PEFT techniques across various applications. Our empirical validations, conducted across multiple datasets, demonstrate the efficacy of these methods, showcasing both theoretical validity and practical performance improvements under the guidance of our analytical findings. We believe our work will deepen researchers' understanding of PEFT and other techniques, prompting further contemplation and advancing research across the whole community. |
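The abstract characterizes PEFT as optimizing a small subset of parameters while the pre-trained weights stay fixed, and frames existing methods through a decomposition lens. The record does not describe the paper's two proposed methods, so the snippet below is only a minimal, generic sketch of a decomposition-based PEFT module in the LoRA style; the class name, rank, and scaling values are illustrative assumptions, not the paper's implementation.

```python
# Minimal LoRA-style low-rank adapter (illustrative sketch only; not the methods proposed in the paper).
# The frozen weight W is adapted as W + (alpha / r) * B @ A, where only A and B are trained.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # keep the pre-trained weight fixed
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank decomposition of the weight update: delta_W = B @ A
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init => no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the trainable low-rank correction
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable parameters: {trainable} / {total}")
```

Only the two low-rank factors receive gradients, which is what keeps fine-tuning and storage costs low for large base models; this is the kind of decomposition the abstract's unified analysis is concerned with.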
doi_str_mv | 10.48550/arxiv.2407.05417 |
format | Article |
fullrecord | arXiv open-access record, created 2024-07-07; full text: https://arxiv.org/abs/2407.05417; license: http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2407.05417 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2407_05417 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning |
title | See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition |
url | https://arxiv.org/abs/2407.05417 |