Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos from models trained on offline data. Similarly, data-driven decision-making and robotic control also require learning a generator function from offline data to serve as the strategy or policy. Applying deep generative models to offline policy learning therefore holds great potential, and numerous studies have explored this direction. However, the field still lacks a comprehensive review, so developments in its different branches have remained relatively independent. In this paper, we provide the first systematic review of the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are the two main branches of offline policy learning and are widely adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on how the DGM is used, and trace the development of algorithms in that field. Following the main content, we provide in-depth discussions of deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list at https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.
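To make the abstract's core idea concrete, the sketch below shows one representative way a DGM can serve as an offline policy: a conditional variational auto-encoder trained by behavior cloning on a fixed dataset of state-action pairs, in the spirit of the VAE-based generative components used by offline RL methods such as BCQ. This is an illustrative sketch only, not code from the paper; the class name, dimensions, and hyperparameters are hypothetical.

# Minimal sketch (assumptions noted above): a conditional VAE as an offline policy.
import torch
import torch.nn as nn

class CVAEPolicy(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=8, hidden=256):
        super().__init__()
        # Encoder q(z | s, a): infers a latent code from a state-action pair.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # outputs mean and log-variance
        )
        # Decoder p(a | s, z): generates an action from state and latent code.
        self.decoder = nn.Sequential(
            nn.Linear(state_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )
        self.latent_dim = latent_dim

    def elbo_loss(self, state, action):
        # Standard ELBO: reconstruction term plus KL to a unit-Gaussian prior.
        mu, log_var = self.encoder(torch.cat([state, action], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization
        recon = self.decoder(torch.cat([state, z], -1))
        recon_loss = ((recon - action) ** 2).mean()
        kl = -0.5 * (1 + log_var - mu.pow(2) - log_var.exp()).mean()
        return recon_loss + 0.5 * kl

    @torch.no_grad()
    def act(self, state):
        # Sample near the prior mode and decode, so generated actions stay
        # within the support of the offline dataset.
        z = torch.randn(state.shape[0], self.latent_dim).clamp(-0.5, 0.5)
        return self.decoder(torch.cat([state, z], -1))

Training amounts to minimizing elbo_loss over mini-batches of the offline dataset. Imitation-learning methods can use the decoder directly as the policy, while offline RL methods typically pair such a generative model with a learned value function to keep the improved policy close to the data distribution.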

Bibliographic Details
Main Authors: Chen, Jiayu; Ganguly, Bhargav; Xu, Yang; Mei, Yongsheng; Lan, Tian; Aggarwal, Vaneet
Format: Article
Language: English
Published: 2024-02-21
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning
DOI: 10.48550/arxiv.2402.13777
Online Access: https://arxiv.org/abs/2402.13777