DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck

Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the u...

Bibliographic details
Published in:	arXiv.org 2022-07
Main authors:	Fan, Jiameng; Li, Wenchao
Format:	Article
Language:	eng
Subjects:	Control tasks; Deep learning; Representations; Robustness; Visual control; Visual tasks
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Fan, Jiameng ; Li, Wenchao
description	Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the unsupervised multi-view setting. Specifically, we introduce a novel contrastive version of the Multi-View Information Bottleneck (MIB) objective for temporal data. We train RL agents from pixels with this auxiliary objective to learn robust representations that can compress away task-irrelevant information and are predictive of task-relevant dynamics. This approach enables us to train high-performance policies that are robust to visual distractions and can generalize well to unseen environments. We demonstrate that our approach can achieve SOTA performance on a diverse set of visual control tasks in the DeepMind Control Suite when the background is replaced with natural videos. In addition, we show that our approach outperforms well-established baselines for generalization to unseen environments on the Procgen benchmark. Our code is open-sourced and available at https://github.com/BU-DEPEND-Lab/DRIBO.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2494717789</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2494717789</sourcerecordid><originalsourceid>FETCH-proquest_journals_24947177893</originalsourceid><addsrcrecordid>eNqNyt0KgjAYgOERBEl5D4OOBd20aYdmUVAIEp3Kks-Y6Wb7qduvoAvo6D14nwnyCKVRkMaEzJBvTBeGIVkxkiTUQ2VRHfJyjSt1dcbiAmDEFQjZKt3AANLiI3Athbzhp-D45HorgouAFz58zcCtUBLnytoeJDT3BZq2vDfg_zpHy932vNkHo1YPB8bWnXJaflZN4ixmEWNpRv9Tb6UiPhQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2494717789</pqid></control><display><type>article</type><title>DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck</title><source>Free E- Journals</source><creator>Fan, Jiameng ; Li, Wenchao</creator><creatorcontrib>Fan, Jiameng ; Li, Wenchao</creatorcontrib><description>Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the unsupervised multi-view setting. Specifically, we introduce a novel contrastive version of the Multi-View Information Bottleneck (MIB) objective for temporal data. We train RL agents from pixels with this auxiliary objective to learn robust representations that can compress away task-irrelevant information and are predictive of task-relevant dynamics. This approach enables us to train high-performance policies that are robust to visual distractions and can generalize well to unseen environments. We demonstrate that our approach can achieve SOTA performance on a diverse set of visual control tasks in the DeepMind Control Suite when the background is replaced with natural videos. In addition, we show that our approach outperforms well-established baselines for generalization to unseen environments on the Procgen benchmark. Our code is open-sourced and available at https://github.com/BU-DEPEND-Lab/DRIBO.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Control tasks ; Deep learning ; Representations ; Robustness ; Visual control ; Visual tasks</subject><ispartof>arXiv.org, 2022-07</ispartof><rights>2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Fan, Jiameng</creatorcontrib><creatorcontrib>Li, Wenchao</creatorcontrib><title>DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck</title><title>arXiv.org</title><description>Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the unsupervised multi-view setting. Specifically, we introduce a novel contrastive version of the Multi-View Information Bottleneck (MIB) objective for temporal data. We train RL agents from pixels with this auxiliary objective to learn robust representations that can compress away task-irrelevant information and are predictive of task-relevant dynamics. This approach enables us to train high-performance policies that are robust to visual distractions and can generalize well to unseen environments. We demonstrate that our approach can achieve SOTA performance on a diverse set of visual control tasks in the DeepMind Control Suite when the background is replaced with natural videos. In addition, we show that our approach outperforms well-established baselines for generalization to unseen environments on the Procgen benchmark. Our code is open-sourced and available at https://github.com/BU-DEPEND-Lab/DRIBO.</description><subject>Control tasks</subject><subject>Deep learning</subject><subject>Representations</subject><subject>Robustness</subject><subject>Visual control</subject><subject>Visual tasks</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNyt0KgjAYgOERBEl5D4OOBd20aYdmUVAIEp3Kks-Y6Wb7qduvoAvo6D14nwnyCKVRkMaEzJBvTBeGIVkxkiTUQ2VRHfJyjSt1dcbiAmDEFQjZKt3AANLiI3Athbzhp-D45HorgouAFz58zcCtUBLnytoeJDT3BZq2vDfg_zpHy932vNkHo1YPB8bWnXJaflZN4ixmEWNpRv9Tb6UiPhQ</recordid><startdate>20220714</startdate><enddate>20220714</enddate><creator>Fan, Jiameng</creator><creator>Li, Wenchao</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220714</creationdate><title>DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck</title><author>Fan, Jiameng ; Li, Wenchao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_24947177893</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Control tasks</topic><topic>Deep learning</topic><topic>Representations</topic><topic>Robustness</topic><topic>Visual control</topic><topic>Visual tasks</topic><toplevel>online_resources</toplevel><creatorcontrib>Fan, Jiameng</creatorcontrib><creatorcontrib>Li, Wenchao</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fan, Jiameng</au><au>Li, Wenchao</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck</atitle><jtitle>arXiv.org</jtitle><date>2022-07-14</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the unsupervised multi-view setting. Specifically, we introduce a novel contrastive version of the Multi-View Information Bottleneck (MIB) objective for temporal data. We train RL agents from pixels with this auxiliary objective to learn robust representations that can compress away task-irrelevant information and are predictive of task-relevant dynamics. This approach enables us to train high-performance policies that are robust to visual distractions and can generalize well to unseen environments. We demonstrate that our approach can achieve SOTA performance on a diverse set of visual control tasks in the DeepMind Control Suite when the background is replaced with natural videos. In addition, we show that our approach outperforms well-established baselines for generalization to unseen environments on the Procgen benchmark. Our code is open-sourced and available at https://github.com/BU-DEPEND-Lab/DRIBO.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-07
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2494717789
source	Free E- Journals
subjects	Control tasks ; Deep learning ; Representations ; Robustness ; Visual control ; Visual tasks
title	DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T05%3A52%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=DRIBO:%20Robust%20Deep%20Reinforcement%20Learning%20via%20Multi-View%20Information%20Bottleneck&rft.jtitle=arXiv.org&rft.au=Fan,%20Jiameng&rft.date=2022-07-14&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2494717789%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2494717789&rft_id=info:pmid/&rfr_iscdi=true