Deep Reinforcement Learning With Macro-Actions

Bibliographic Details
Main Authors: Durugkar, Ishan P; Rosenbaum, Clemens; Dernbach, Stefan; Mahadevan, Sridhar
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning; Computer Science - Neural and Evolutionary Computing
Online Access: https://arxiv.org/abs/1606.04615
Description: Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain. In this paper, we explore output representation modeling in the form of temporal abstraction to improve convergence and reliability of deep reinforcement learning approaches. We concentrate on macro-actions, and evaluate these on different Atari 2600 games, where we show that they yield significant improvements in learning speed. Additionally, we show that they can even achieve better scores than DQN. We offer analysis and explanation for both convergence and final results, revealing a problem deep RL approaches have with sparse reward signals.
DOI: 10.48550/arxiv.1606.04615
Date: 2016-06-14
Source: arXiv.org
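The abstract centers on macro-actions, i.e. temporally extended actions composed of primitive actions. The paper's exact macro-action construction is not given in this record, so the following is only a minimal sketch under stated assumptions: a Gym-style environment with a `step(action)` interface, and macro-actions defined (illustratively) as a single primitive action repeated for a fixed length `k`. The class name `MacroActionEnv` and all parameter names are hypothetical.

```python
# Minimal sketch: macro-actions as fixed-length repetitions of primitive
# actions, one common form of temporal abstraction. Assumes a Gym-style
# environment whose step(action) returns (obs, reward, done, info). The
# repeat-only macros and the length k=4 are illustrative assumptions, not
# the paper's exact construction.

class MacroActionEnv:
    """Wraps an environment so the agent chooses among macro-actions;
    each macro-action replays a fixed sequence of primitive actions."""

    def __init__(self, env, num_primitive_actions, k=4):
        self.env = env
        # Macro-action i repeats primitive action i for k steps; richer
        # macros would be arbitrary sequences of primitive actions.
        self.macros = [[a] * k for a in range(num_primitive_actions)]

    @property
    def num_macro_actions(self):
        return len(self.macros)

    def step(self, macro_index):
        """Execute one whole macro-action, accumulating reward."""
        total_reward, done, info = 0.0, False, {}
        for action in self.macros[macro_index]:
            obs, reward, done, info = self.env.step(action)
            total_reward += reward  # an SMDP-style learner would instead
            if done:                # accumulate gamma**t * reward here
                break
        # The learner observes only the state after the full macro, so it
        # decides and receives credit at a coarser time scale -- one way
        # temporal abstraction can ease sparse reward signals.
        return obs, total_reward, done, info
```

In a DQN-style learner one would then size the network's output layer to `num_macro_actions` (possibly alongside the primitive actions) and train as usual, since the wrapper preserves the step interface; each decision now spans `k` frames, so a reward that is `k` primitive steps away is only one decision away in macro time.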