Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training

As highly expressive generative models, diffusion models have demonstrated exceptional success across various domains, including image generation, natural language processing, and combinatorial optimization. However, as data distributions grow more complex, training these models to convergence becomes increasingly computationally intensive. While diffusion models are typically trained using uniform timestep sampling, our research shows that the variance in stochastic gradients varies significantly across timesteps, with high-variance timesteps becoming bottlenecks that hinder faster convergence. To address this issue, we introduce a non-uniform timestep sampling method that prioritizes these more critical timesteps. Our method tracks the impact of gradient updates on the objective for each timestep, adaptively selecting those most likely to minimize the objective effectively. Experimental results demonstrate that this approach not only accelerates the training process, but also leads to improved performance at convergence. Furthermore, our method shows robust performance across various datasets, scheduling strategies, and diffusion architectures, outperforming previously proposed timestep sampling and weighting heuristics that lack this degree of robustness.
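The abstract describes the core mechanism: track how useful gradient updates at each timestep are for the training objective, then sample timesteps non-uniformly in proportion to that estimate. The sketch below shows one way such a loop could look in PyTorch. It is a minimal illustration under stated assumptions — an EMA of per-timestep loss as the impact proxy, importance weighting to keep the gradient estimate unbiased, a single device, and a model callable as `model(x_t, t)` — not the authors' exact algorithm, and all identifiers are hypothetical.

```python
import torch

# Sketch: adaptive non-uniform timestep sampling for diffusion training.
# The impact proxy (EMA of per-timestep loss) and all names here are
# illustrative assumptions, not the paper's exact method.

T = 1000               # number of diffusion timesteps
score = torch.ones(T)  # per-timestep priority, adapted online
EMA = 0.9              # smoothing factor for the priority update

def sample_timesteps(batch_size: int) -> torch.Tensor:
    """Draw timesteps with probability proportional to their priority."""
    probs = score / score.sum()
    return torch.multinomial(probs, batch_size, replacement=True)

def training_step(model, optimizer, x0: torch.Tensor,
                  alpha_bar: torch.Tensor) -> float:
    """One noise-prediction step with non-uniform timestep sampling.

    Assumes all tensors live on the same device as `score`.
    """
    b = x0.size(0)
    t = sample_timesteps(b)
    noise = torch.randn_like(x0)

    # Forward process q(x_t | x_0) under the cumulative schedule alpha_bar.
    a = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * noise

    pred = model(x_t, t)
    per_sample = ((pred - noise) ** 2).flatten(1).mean(dim=1)

    # Importance weights: correct for the non-uniform proposal so the
    # gradient remains an unbiased estimate of the uniform objective.
    probs = score / score.sum()
    w = 1.0 / (T * probs[t])
    loss = (w * per_sample).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Adapt priorities: timesteps with larger recent loss get sampled
    # more often (a cheap proxy for their impact on the objective).
    with torch.no_grad():
        for ti, li in zip(t.tolist(), per_sample.detach().tolist()):
            score[ti] = EMA * score[ti] + (1.0 - EMA) * li
    return loss.item()
```

Note the importance weights: without them, oversampling high-loss timesteps would silently change the objective being minimized; with them, the non-uniform proposal only changes the variance of the gradient estimate, which is exactly the quantity the paper targets.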

Detailed Description

Bibliographic Details
Main Authors: Kim, Myunsoo; Ki, Donghyeon; Shim, Seong-Woong; Lee, Byung-Jun
Format: Article
Language: English
Published: 2024-11-15
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning
Source: arXiv.org
Online Access: Order full text
DOI: 10.48550/arxiv.2411.09998