Robust Imitation Learning against Variations in Environment Dynamics
In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. The existing IL framework trained in a single environment can catastrophically fail with perturbations in environment dynamics because it does not capture...
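Main authors: | Chae, Jongseong; Han, Seungyul; Jung, Whiyoung; Cho, Myungsik; Choi, Sungho; Sung, Youngchul |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Learning |
Online access: | Order full text |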
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Chae, Jongseong ; Han, Seungyul ; Jung, Whiyoung ; Cho, Myungsik ; Choi, Sungho ; Sung, Youngchul |
description | In this paper, we propose a robust imitation learning (IL) framework that
improves the robustness of IL when environment dynamics are perturbed. The
existing IL framework trained in a single environment can fail catastrophically
under perturbations in environment dynamics because it does not account for the
possibility that the underlying environment dynamics can change. Our framework
effectively deals with environments with varying dynamics by imitating multiple
experts in sampled environment dynamics to enhance robustness to general
variations in environment dynamics. In order to robustly imitate the multiple
sample experts, we minimize the risk with respect to the Jensen-Shannon
divergence between the agent's policy and each of the sample experts. Numerical
results show that our algorithm significantly improves robustness against
dynamics perturbations compared to conventional IL baselines. |
doi_str_mv | 10.48550/arxiv.2206.09314 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2206_09314</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2206_09314</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-9171c42ba7d7ddc224c9ff82d6edc8c02cd5070614ec3c2e0b719ec73f8f47d73</originalsourceid><addsrcrecordid>eNotj71qwzAURrVkKEkfoFP1Anb0Z8seQ5K2AUOhhK7m-koKF2I5yG5o3r6p2-kbDt-Bw9iTFLmpikKsIX3TNVdKlLmotTQPbPcxdF_jxA89TTDREHnjIUWKJw4noHhHn5BoRiOnyPfxSmmIvY8T390i9ITjii0CnEf_-L9LdnzZH7dvWfP-ethumgxKa7JaWolGdWCddQ6VMliHUClXeocVCoWuEFaU0njUqLzorKw9Wh2qYO4fvWTPf9o5o70k6iHd2t-cds7RP8-3RkE</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Robust Imitation Learning against Variations in Environment Dynamics</title><source>arXiv.org</source><creator>Chae, Jongseong ; Han, Seungyul ; Jung, Whiyoung ; Cho, Myungsik ; Choi, Sungho ; Sung, Youngchul</creator><creatorcontrib>Chae, Jongseong ; Han, Seungyul ; Jung, Whiyoung ; Cho, Myungsik ; Choi, Sungho ; Sung, Youngchul</creatorcontrib><description>In this paper, we propose a robust imitation learning (IL) framework that
improves the robustness of IL when environment dynamics are perturbed. The
existing IL framework trained in a single environment can catastrophically fail
with perturbations in environment dynamics because it does not capture the
situation that underlying environment dynamics can be changed. Our framework
effectively deals with environments with varying dynamics by imitating multiple
experts in sampled environment dynamics to enhance the robustness in general
variations in environment dynamics. In order to robustly imitate the multiple
sample experts, we minimize the risk with respect to the Jensen-Shannon
divergence between the agent's policy and each of the sample experts. Numerical
results show that our algorithm significantly improves robustness against
dynamics perturbations compared to conventional IL baselines.</description><identifier>DOI: 10.48550/arxiv.2206.09314</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2022-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2206.09314$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2206.09314$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Chae, Jongseong</creatorcontrib><creatorcontrib>Han, Seungyul</creatorcontrib><creatorcontrib>Jung, Whiyoung</creatorcontrib><creatorcontrib>Cho, Myungsik</creatorcontrib><creatorcontrib>Choi, Sungho</creatorcontrib><creatorcontrib>Sung, Youngchul</creatorcontrib><title>Robust Imitation Learning against Variations in Environment Dynamics</title><description>In this paper, we propose a robust imitation learning (IL) framework that
improves the robustness of IL when environment dynamics are perturbed. The
existing IL framework trained in a single environment can catastrophically fail
with perturbations in environment dynamics because it does not capture the
situation that underlying environment dynamics can be changed. Our framework
effectively deals with environments with varying dynamics by imitating multiple
experts in sampled environment dynamics to enhance the robustness in general
variations in environment dynamics. In order to robustly imitate the multiple
sample experts, we minimize the risk with respect to the Jensen-Shannon
divergence between the agent's policy and each of the sample experts. Numerical
results show that our algorithm significantly improves robustness against
dynamics perturbations compared to conventional IL baselines.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj71qwzAURrVkKEkfoFP1Anb0Z8seQ5K2AUOhhK7m-koKF2I5yG5o3r6p2-kbDt-Bw9iTFLmpikKsIX3TNVdKlLmotTQPbPcxdF_jxA89TTDREHnjIUWKJw4noHhHn5BoRiOnyPfxSmmIvY8T390i9ITjii0CnEf_-L9LdnzZH7dvWfP-ethumgxKa7JaWolGdWCddQ6VMliHUClXeocVCoWuEFaU0njUqLzorKw9Wh2qYO4fvWTPf9o5o70k6iHd2t-cds7RP8-3RkE</recordid><startdate>20220618</startdate><enddate>20220618</enddate><creator>Chae, Jongseong</creator><creator>Han, Seungyul</creator><creator>Jung, Whiyoung</creator><creator>Cho, Myungsik</creator><creator>Choi, Sungho</creator><creator>Sung, Youngchul</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220618</creationdate><title>Robust Imitation Learning against Variations in Environment Dynamics</title><author>Chae, Jongseong ; Han, Seungyul ; Jung, Whiyoung ; Cho, Myungsik ; Choi, Sungho ; Sung, Youngchul</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-9171c42ba7d7ddc224c9ff82d6edc8c02cd5070614ec3c2e0b719ec73f8f47d73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Chae, Jongseong</creatorcontrib><creatorcontrib>Han, Seungyul</creatorcontrib><creatorcontrib>Jung, Whiyoung</creatorcontrib><creatorcontrib>Cho, Myungsik</creatorcontrib><creatorcontrib>Choi, Sungho</creatorcontrib><creatorcontrib>Sung, Youngchul</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chae, Jongseong</au><au>Han, Seungyul</au><au>Jung, Whiyoung</au><au>Cho, Myungsik</au><au>Choi, Sungho</au><au>Sung, Youngchul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust Imitation Learning against Variations in Environment Dynamics</atitle><date>2022-06-18</date><risdate>2022</risdate><abstract>In this paper, we propose a robust imitation learning (IL) framework that
improves the robustness of IL when environment dynamics are perturbed. The
existing IL framework trained in a single environment can catastrophically fail
with perturbations in environment dynamics because it does not capture the
situation that underlying environment dynamics can be changed. Our framework
effectively deals with environments with varying dynamics by imitating multiple
experts in sampled environment dynamics to enhance the robustness in general
variations in environment dynamics. In order to robustly imitate the multiple
sample experts, we minimize the risk with respect to the Jensen-Shannon
divergence between the agent's policy and each of the sample experts. Numerical
results show that our algorithm significantly improves robustness against
dynamics perturbations compared to conventional IL baselines.</abstract><doi>10.48550/arxiv.2206.09314</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2206.09314 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2206_09314 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence ; Computer Science - Learning |
title | Robust Imitation Learning against Variations in Environment Dynamics |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T18%3A01%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20Imitation%20Learning%20against%20Variations%20in%20Environment%20Dynamics&rft.au=Chae,%20Jongseong&rft.date=2022-06-18&rft_id=info:doi/10.48550/arxiv.2206.09314&rft_dat=%3Carxiv_GOX%3E2206_09314%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
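
The description field refers to the Jensen-Shannon divergence between the agent's policy and each of several sample experts. As a minimal illustration of that quantity only, the Python sketch below computes the divergence between discrete action distributions and summarizes it across experts. It is not the authors' algorithm: the record does not state which risk measure the paper uses, so the mean and worst-case summaries, the function names, and the example distributions are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions.

    Terms with p_i == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence of p and q
    to their midpoint distribution m = (p + q) / 2.

    It is symmetric and bounded above by log(2) when measured in nats.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def divergence_summary(agent, experts):
    """Hypothetical helper: divergence of the agent from each expert,
    plus two possible aggregate risks (mean and worst case).

    Which aggregate the paper actually minimizes is not stated in this
    record; both are shown purely as assumptions.
    """
    per_expert = [js_divergence(agent, expert) for expert in experts]
    return {
        "per_expert": per_expert,
        "mean": sum(per_expert) / len(per_expert),
        "worst_case": max(per_expert),
    }


if __name__ == "__main__":
    # Made-up action distributions over three actions, one expert per
    # sampled environment dynamics.
    agent_policy = [0.5, 0.3, 0.2]
    sample_experts = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
    print(divergence_summary(agent_policy, sample_experts))
```

A worst-case summary penalizes a policy that imitates one expert well but another poorly, while a mean summary lets good and bad matches offset each other; which trade-off applies depends on the risk definition in the paper itself.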