Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Detailed description

Understanding alignment techniques begins with comprehending the zero-shot generalization brought about by instruction tuning, yet the underlying mechanism remains poorly understood. Existing work has largely been confined to the task level, without considering that tasks are artificially defined and, to LLMs, merely consist of tokens and representations. This line of research has examined transfer between tasks from a task-pair perspective, with few studies focusing on understanding zero-shot generalization from the perspective of the data itself. To bridge this gap, we first demonstrate through multiple metrics that zero-shot generalization during instruction tuning happens very early. Next, we investigate its facilitation from both data-similarity and data-granularity perspectives, confirming that encountering highly similar and fine-grained training data early during instruction tuning, without the constraints of predefined "tasks", enables better generalization. Finally, we propose a more grounded training-data arrangement method, Test-centric Multi-turn Arrangement, and show its effectiveness in promoting continual learning and further loss reduction. For the first time, we show that zero-shot generalization during instruction tuning is a form of similarity-based generalization between training and test data at the instance level. We hope our analysis will advance the understanding of zero-shot generalization during instruction tuning and contribute to the development of more aligned LLMs. Our code is released at https://github.com/HBX-hbx/dynamics_of_zero-shot_generalization.
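The abstract's central claim, that generalization tracks instance-level similarity between training and test data rather than task boundaries, suggests ordering training examples by their embedding similarity to the test set. The sketch below is only an illustrative approximation of that idea, not the paper's actual Test-centric Multi-turn Arrangement; the function name `arrange_by_test_similarity`, the MiniLM embedding model, and the max-similarity scoring are all assumptions introduced here.

```python
# Illustrative sketch: arrange instruction-tuning data by instance-level
# similarity to test instances. This approximates the idea described in the
# abstract; it is NOT the paper's exact Test-centric Multi-turn Arrangement.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

def arrange_by_test_similarity(train_texts, test_texts,
                               model_name="all-MiniLM-L6-v2"):
    """Return indices of train_texts ordered so that examples most similar
    to the test set (max cosine similarity to any test instance) come first."""
    model = SentenceTransformer(model_name)
    train_emb = model.encode(train_texts, normalize_embeddings=True)
    test_emb = model.encode(test_texts, normalize_embeddings=True)
    # With normalized embeddings, cosine similarity is a plain dot product.
    sim = train_emb @ test_emb.T        # shape: (n_train, n_test)
    score = sim.max(axis=1)             # nearest test instance per train example
    return np.argsort(-score)           # most similar first

if __name__ == "__main__":
    train = ["Translate 'cat' to French.",
             "Summarize this news article.",
             "What is 2 + 2?"]
    test = ["Translate 'dog' to German."]
    for i in arrange_by_test_similarity(train, test):
        print(train[i])
```

If the paper's finding holds, a curriculum that front-loads the high-similarity instances returned by such an ordering should show earlier test-loss reduction than a random ordering.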

Bibliographic details
Main authors: He, Bingxiang; Ding, Ning; Qian, Cheng; Deng, Jia; Cui, Ganqu; Yuan, Lifan; Gao, Huan-ang; Chen, Huimin; Liu, Zhiyuan; Sun, Maosong
Format: Article
Language: English
Published: 2024-06-17
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning
DOI: 10.48550/arxiv.2406.11721
Source: arXiv.org
Rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0
Online access: https://arxiv.org/abs/2406.11721