Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training
creator | Kim, Myunsoo ; Ki, Donghyeon ; Shim, Seong-Woong ; Lee, Byung-Jun |
description | As a highly expressive class of generative models, diffusion models have demonstrated exceptional success across various domains, including image generation, natural language processing, and combinatorial optimization. However, as data distributions grow more complex, training these models to convergence becomes increasingly computationally intensive. While diffusion models are typically trained with uniform timestep sampling, our research shows that the variance of the stochastic gradients differs significantly across timesteps, with high-variance timesteps becoming bottlenecks that hinder faster convergence. To address this issue, we introduce a non-uniform timestep sampling method that prioritizes these more critical timesteps. Our method tracks the impact of gradient updates on the objective at each timestep and adaptively selects the timesteps most likely to minimize the objective effectively. Experimental results demonstrate that this approach not only accelerates the training process but also leads to improved performance at convergence. Furthermore, our method shows robust performance across various datasets, scheduling strategies, and diffusion architectures, outperforming previously proposed timestep sampling and weighting heuristics that lack this degree of robustness. |
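The abstract describes the method only at a high level: maintain a per-timestep estimate of how much a gradient update at that timestep reduces the training objective, and sample timesteps in proportion to that estimate. Below is a minimal illustrative sketch of one way such a sampler could be wired into a training loop. It is an assumption-laden reconstruction, not the authors' published algorithm: the class name `AdaptiveTimestepSampler`, the EMA-of-improvement score, and the `temperature` / `uniform_mix` hyperparameters are all invented here for illustration.

```python
import numpy as np

class AdaptiveTimestepSampler:
    """Sample diffusion timesteps non-uniformly, favoring timesteps whose
    recent gradient updates reduced the objective the most (illustrative)."""

    def __init__(self, num_timesteps, ema=0.9, temperature=1.0, uniform_mix=0.1):
        self.num_timesteps = num_timesteps
        self.ema = ema                  # smoothing of the per-timestep score
        self.temperature = temperature  # sharpness of the sampling distribution
        self.uniform_mix = uniform_mix  # fraction of uniform exploration
        self.scores = np.zeros(num_timesteps)

    def probabilities(self):
        # Softmax over scores, mixed with a uniform floor so every
        # timestep keeps a nonzero chance of being sampled.
        z = self.scores / self.temperature
        p = np.exp(z - z.max())
        p /= p.sum()
        uniform = np.full(self.num_timesteps, 1.0 / self.num_timesteps)
        return (1.0 - self.uniform_mix) * p + self.uniform_mix * uniform

    def sample(self, batch_size):
        p = self.probabilities()
        t = np.random.choice(self.num_timesteps, size=batch_size, p=p)
        return t, p[t]  # return probabilities for optional loss reweighting

    def update(self, t, loss_before, loss_after):
        # Credit timestep t with the observed drop in the objective
        # after the last gradient step taken at that timestep.
        improvement = loss_before - loss_after
        self.scores[t] = self.ema * self.scores[t] + (1.0 - self.ema) * improvement
```

In a training loop one would draw `t, p_t = sampler.sample(batch_size)`, compute the denoising loss at those timesteps, take an optimizer step, and call `update` with the loss measured before and after the step. If an unbiased estimate of the original uniform-timestep objective is needed, the per-sample losses can additionally be importance-weighted by `1 / (num_timesteps * p_t)`.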
doi_str_mv | 10.48550/arxiv.2411.09998 |
format | Article |
creationdate | 2024-11-15 |
rights | http://creativecommons.org/licenses/by/4.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2411.09998 |
language | eng |
recordid | cdi_arxiv_primary_2411_09998 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition Computer Science - Learning |
title | Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T21%3A12%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adaptive%20Non-Uniform%20Timestep%20Sampling%20for%20Diffusion%20Model%20Training&rft.au=Kim,%20Myunsoo&rft.date=2024-11-15&rft_id=info:doi/10.48550/arxiv.2411.09998&rft_dat=%3Carxiv_GOX%3E2411_09998%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |