Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Detailed description

Understanding alignment techniques begins with comprehending the zero-shot generalization brought about by instruction tuning, yet the underlying mechanism remains poorly understood. Existing work has largely been confined to the task level, without considering that tasks are artificially defined and, to LLMs, merely consist of tokens and representations. This line of research has examined transfer between tasks from a task-pair perspective, with few studies focusing on understanding zero-shot generalization from the perspective of the data itself. To bridge this gap, we first demonstrate through multiple metrics that zero-shot generalization during instruction tuning happens very early. Next, we investigate its facilitation from both data-similarity and data-granularity perspectives, confirming that encountering highly similar and fine-grained training data early during instruction tuning, without the constraints of predefined "tasks", enables better generalization. Finally, we propose a more grounded training-data arrangement method, Test-centric Multi-turn Arrangement, and show its effectiveness in promoting continual learning and further loss reduction. For the first time, we show that zero-shot generalization during instruction tuning is a form of similarity-based generalization between training and test data at the instance level. We hope our analysis will advance the understanding of zero-shot generalization during instruction tuning and contribute to the development of more aligned LLMs. Our code is released at https://github.com/HBX-hbx/dynamics_of_zero-shot_generalization.
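The abstract's central claim, that generalization tracks instance-level similarity between training and test data rather than task boundaries, suggests ordering training examples by their embedding similarity to the test set. The sketch below is only an illustrative approximation of that idea, not the paper's actual Test-centric Multi-turn Arrangement; the function name `arrange_by_test_similarity`, the MiniLM embedding model, and the max-similarity scoring are all assumptions introduced here.

```python
# Illustrative sketch: arrange instruction-tuning data by instance-level
# similarity to test instances. This approximates the idea described in the
# abstract; it is NOT the paper's exact Test-centric Multi-turn Arrangement.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

def arrange_by_test_similarity(train_texts, test_texts,
                               model_name="all-MiniLM-L6-v2"):
    """Return indices of train_texts ordered so that examples most similar
    to the test set (max cosine similarity to any test instance) come first."""
    model = SentenceTransformer(model_name)
    train_emb = model.encode(train_texts, normalize_embeddings=True)
    test_emb = model.encode(test_texts, normalize_embeddings=True)
    # With normalized embeddings, cosine similarity is a plain dot product.
    sim = train_emb @ test_emb.T        # shape: (n_train, n_test)
    score = sim.max(axis=1)             # nearest test instance per train example
    return np.argsort(-score)           # most similar first

if __name__ == "__main__":
    train = ["Translate 'cat' to French.",
             "Summarize this news article.",
             "What is 2 + 2?"]
    test = ["Translate 'dog' to German."]
    for i in arrange_by_test_similarity(train, test):
        print(train[i])
```

If the paper's finding holds, a curriculum that front-loads the high-similarity instances returned by such an ordering should show earlier test-loss reduction than a random ordering.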

Bibliographic details
Main authors: He, Bingxiang; Ding, Ning; Qian, Cheng; Deng, Jia; Cui, Ganqu; Yuan, Lifan; Gao, Huan-ang; Chen, Huimin; Liu, Zhiyuan; Sun, Maosong
Format: Article
Language: English
Published: 2024-06-17
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning
DOI: 10.48550/arxiv.2406.11721
Source: arXiv.org
Rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0
Online access: https://arxiv.org/abs/2406.11721