Chain Form Reinforcement Learning for Small-Memory Agent

In this paper, we propose Chain Form Reinforcement Learning for a reinforcement learning agent that has small memory. In the real world, learning is difficult because there is an effectively infinite number of states and actions, which demands large amounts of stored memory and long learning times. To address this, estimated values are categorized as “GOOD” or “NO GOOD” during the reinforcement learning process. Additionally, the alignment sequence of the estimated values is changed, since the sequence itself is regarded as important. We conducted simulations to observe the influence of our methods; the results show no adverse effect on learning speed.
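The abstract describes two ideas: coarsening stored value estimates into a binary “GOOD” / “NO GOOD” label, and treating the ordering of those estimates as meaningful. The following is a minimal, hypothetical sketch of the first idea only; the class, method names, and `threshold` parameter are invented for illustration and do not reproduce the authors' published algorithm. It stores one bit per state-action pair instead of a floating-point value.

```python
import random

GOOD, NO_GOOD = 1, 0

class BinaryValueAgent:
    """Illustrative small-memory agent (hypothetical sketch): one
    GOOD/NO GOOD bit per state-action pair instead of a float estimate."""

    def __init__(self, n_states, n_actions, threshold=0.0):
        self.threshold = threshold  # reward level separating GOOD from NO GOOD
        self.labels = [[NO_GOOD] * n_actions for _ in range(n_states)]

    def act(self, state, epsilon=0.1):
        """Prefer actions currently labeled GOOD; otherwise explore at random."""
        good = [a for a, v in enumerate(self.labels[state]) if v == GOOD]
        if good and random.random() > epsilon:
            return random.choice(good)
        return random.randrange(len(self.labels[state]))

    def update(self, state, action, reward):
        """Categorize the outcome instead of averaging it:
        one comparison, one stored bit."""
        self.labels[state][action] = GOOD if reward > self.threshold else NO_GOOD
```

A conventional tabular Q-learner keeps a float per state-action pair; here each entry collapses to a single bit, which is the kind of memory saving the abstract targets, at the cost of discarding magnitude information.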

Bibliographic Details
Published in: Journal of Japan Society for Fuzzy Theory and Intelligent Informatics, 2012/04/15, Vol. 24(2), pp. 691-696
Main Authors: NOTSU, Akira; KOMORI, Yuki; HONDA, Katsuhiro; ICHIHASHI, Hidetomo; IWAMOTO, Yuki
Format: Article
Language: English; Japanese
Online Access: Full text
DOI: 10.3156/jsoft.24.691
ISSN: 1347-7986
EISSN: 1881-7203
Source: J-STAGE (Japan Science & Technology Information Aggregator, Electronic) Freely Available Titles - Japanese; EZB-FREE-00999 freely available EZB journals
Subjects: Alignment; Chains; Fuzzy; Fuzzy logic; Fuzzy set theory; Learning; Q-learning; Reinforcement; Reinforcement learning; Simulation; State-Action set categorization