Chain Form Reinforcement Learning for Small-Memory Agent
In this paper, we propose Chain Form Reinforcement Learning for a reinforcement learning agent that has a small memory. In the real world, learning is difficult because there are an infinite number of states and actions, which demand a large amount of stored memory and long learning times. To address this problem, estimated values are categorized as "GOOD" or "NO GOOD" during the reinforcement learning process. Additionally, the alignment sequence of the estimated values is changed, as the sequence itself is regarded as important. We conducted several simulations and observed the influence of our method; the results show no adverse effect on learning speed.
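This record contains only the abstract, so the paper's actual update rules are not available here. Still, the memory-saving idea the abstract names is concrete enough to sketch. The Python fragment below (all names hypothetical: CoarseQAgent, n_states, good) shows what a one-bit "GOOD"/"NO GOOD" value table might look like on top of plain tabular reinforcement learning; read it as an interpretation of the abstract, not as the authors' Chain Form algorithm, and note that it omits the value-sequence reordering the paper also proposes.

```python
# Hypothetical sketch, not the authors' algorithm: each stored Q-value
# is replaced by a one-bit "GOOD"/"NO GOOD" category to shrink memory.
import random

class CoarseQAgent:
    """Tabular agent storing one bit per (state, action) pair."""

    def __init__(self, n_states, n_actions, epsilon=0.1):
        self.n_actions = n_actions
        self.epsilon = epsilon  # exploration rate
        # True = "GOOD", False = "NO GOOD"; one bit replaces one float.
        self.good = [[False] * n_actions for _ in range(n_states)]

    def act(self, state):
        # Epsilon-greedy over the coarse table: prefer actions marked GOOD.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        good = [a for a in range(self.n_actions) if self.good[state][a]]
        return random.choice(good) if good else random.randrange(self.n_actions)

    def update(self, state, action, reward, next_state):
        # One-bit analogue of a Bellman backup: a pair is GOOD if it
        # paid off directly or leads to a state with a GOOD action.
        self.good[state][action] = reward > 0 or any(self.good[next_state])

# Toy corridor: states 0..4, two actions (left/right), reward at state 4.
agent = CoarseQAgent(n_states=5, n_actions=2)
for episode in range(200):
    s = 0
    for step in range(20):
        a = agent.act(s)
        s2 = min(4, max(0, s + (1 if a == 1 else -1)))
        r = 1.0 if s2 == 4 else 0.0
        agent.update(s, a, r, s2)
        s = s2
        if r > 0:
            break
```

The point of this design is the memory footprint: a conventional Q-table stores one floating-point value per state-action pair, whereas this table stores a single boolean, at the cost of losing value magnitudes.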
Saved in:
Published in: | Journal of Japan Society for Fuzzy Theory and Intelligent Informatics, 2012/04/15, Vol.24(2), pp.691-696 |
---|---|
Main authors: | NOTSU, Akira; KOMORI, Yuki; HONDA, Katsuhiro; ICHIHASHI, Hidetomo; IWAMOTO, Yuki |
Format: | Article |
Language: | eng ; jpn |
Subjects: | Alignment; Chains; Fuzzy logic; Fuzzy set theory; Learning; Q-learning; Reinforcement learning; Simulation; State-Action set categorization |
ISSN: | 1347-7986 |
EISSN: | 1881-7203 |
DOI: | 10.3156/jsoft.24.691 |
Online access: | Full text (freely available via J-STAGE) |