PcLast: Discovering Plannable Continuous Latent States

Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision making, they ignore state reachability, hampering their performance. In this paper, we learn a representation that associates reachable states together for effective planning and goal-conditioned policy learning. We first learn a latent representation with multi-step inverse dynamics (to remove distracting information), and then transform this representation to associate reachable states together in $\ell_2$ space. Our proposals are rigorously tested in various simulation testbeds. Numerical results in reward-based settings show significant improvements in sampling efficiency. Further, in reward-free settings this approach yields layered state abstractions that enable computationally efficient hierarchical planning for reaching ad hoc goals with zero additional samples.
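
As a rough illustration of the two-stage recipe sketched in the abstract (not the authors' implementation), the Python snippet below assumes vector observations and discrete actions; all module names, layer sizes, and the contrastive form of the reachability objective are placeholders chosen for exposition.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Maps a rich observation vector to a compact latent state."""
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, obs):
        return self.net(obs)


class MultiStepInverseModel(nn.Module):
    """Predicts the first action a_t from the latents of x_t and x_{t+k}."""
    def __init__(self, latent_dim: int, num_actions: int, max_k: int):
        super().__init__()
        # One linear head per step offset k is one simple way to condition on k.
        self.heads = nn.ModuleList(
            [nn.Linear(2 * latent_dim, num_actions) for _ in range(max_k)]
        )

    def forward(self, z_t, z_tk, k: int):
        return self.heads[k - 1](torch.cat([z_t, z_tk], dim=-1))


def inverse_dynamics_loss(encoder, inv_model, obs_t, obs_tk, action_t, k: int):
    """Stage 1: multi-step inverse dynamics, which discards distracting
    information that does not help predict the agent's own action."""
    logits = inv_model(encoder(obs_t), encoder(obs_tk), k)
    return nn.functional.cross_entropy(logits, action_t)


def reachability_loss(transform, z_t, z_tk, margin: float = 1.0):
    """Stage 2 (an assumed contrastive form): pull latents of mutually
    reachable states together in l2 space, push unpaired latents apart."""
    pos = (transform(z_t) - transform(z_tk)).pow(2).sum(dim=-1)
    perm = torch.randperm(z_tk.shape[0], device=z_tk.device)
    neg = (transform(z_t) - transform(z_tk[perm])).pow(2).sum(dim=-1)
    return (pos + torch.relu(margin - neg)).mean()
```

In this reading, the first loss shapes the encoder so that only control-relevant information survives, while the second retrofits the latent space so that $\ell_2$ distance tracks reachability, which is what makes simple distance-based goal-conditioned planning viable.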


Bibliographic details
Main authors: Koul, Anurag; Sujit, Shivakanth; Chen, Shaoru; Evans, Ben; Wu, Lili; Xu, Byron; Chari, Rajan; Islam, Riashat; Seraj, Raihan; Efroni, Yonathan; Molu, Lekan; Dudik, Miro; Langford, John; Lamb, Alex
Format: Article
Language: English
Published: 2023-11-06
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning; Computer Science - Robotics
Source: arXiv.org
DOI: 10.48550/arxiv.2311.03534
Online access: Order full text