Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems
This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to find iteratively near-optimal policies of the adaptive optimal stationary control problem directly from input/state data without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
Saved in:

| Published in: | IEEE Transactions on Automatic Control, 2023-04, Vol. 68 (4), p. 2383-2390 |
|---|---|
| Main authors: | Pang, Bo; Jiang, Zhong-Ping |
| Format: | Article |
| Language: | English |
| Online access: | Order full text |
container_end_page | 2390 |
---|---|
container_issue | 4 |
container_start_page | 2383 |
container_title | IEEE transactions on automatic control |
container_volume | 68 |
creator | Pang, Bo; Jiang, Zhong-Ping |
description | This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to find iteratively near-optimal policies of the adaptive optimal stationary control problem directly from input/state data without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness. |
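The policy-iteration loop that the proposed method builds on can be sketched in its classical model-based form (Kleinman's algorithm for the deterministic LQR special case). This is only an illustrative sketch: the matrices `A`, `B`, `Q`, `R` and the double-integrator numbers below are hypothetical, not from the paper, and the paper's actual contribution is to carry out the policy-evaluation step from input/state data via least squares, without knowing the system matrices, in the presence of additive and multiplicative noise.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def kleinman_pi(A, B, Q, R, K0, iters=10):
    """Model-based policy iteration for deterministic continuous-time LQR.

    Starting from a stabilizing gain K0, alternate:
      policy evaluation:  (A - B K)^T P + P (A - B K) = -(Q + K^T R K)
      policy improvement: K <- R^{-1} B^T P
    """
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        # Lyapunov equation Ac^T P + P Ac = -(Q + K^T R K)
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
        K = np.linalg.solve(R, B.T @ P)  # greedy improvement step
    return P, K

# Hypothetical double-integrator example (not from the paper).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # any initial stabilizing gain (A - B K0 is Hurwitz)

P, K = kleinman_pi(A, B, Q, R, K0)
P_are = solve_continuous_are(A, B, Q, R)  # Riccati solution for comparison
```

Each iteration solves a linear Lyapunov equation instead of the quadratic Riccati equation, and the iterates converge to the Riccati solution; the paper's off-policy variant replaces the Lyapunov solve with a least-squares fit to trajectory data, starting likewise from an admissible policy.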
doi_str_mv | 10.1109/TAC.2022.3172250 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9286 |
ispartof | IEEE transactions on automatic control, 2023-04, Vol.68 (4), p.2383-2390 |
issn | 0018-9286; 1558-2523 |
language | eng |
recordid | cdi_proquest_journals_2792135156 |
source | IEEE Electronic Library (IEL) |
subjects | Adaptive control; Adaptive optimal control; Algorithms; data-driven control; Heuristic algorithms; Least squares; Machine learning; Optimal control; Performance analysis; policy iteration; Process control; Reinforcement learning; robustness; stochastic control; Stochastic processes; Stochastic systems |
title | Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T16%3A41%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reinforcement%20Learning%20for%20Adaptive%20Optimal%20Stationary%20Control%20of%20Linear%20Stochastic%20Systems&rft.jtitle=IEEE%20transactions%20on%20automatic%20control&rft.au=Pang,%20Bo&rft.date=2023-04-01&rft.volume=68&rft.issue=4&rft.spage=2383&rft.epage=2390&rft.pages=2383-2390&rft.issn=0018-9286&rft.eissn=1558-2523&rft.coden=IETAA9&rft_id=info:doi/10.1109/TAC.2022.3172250&rft_dat=%3Cproquest_RIE%3E2792135156%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2792135156&rft_id=info:pmid/&rft_ieee_id=9767669&rfr_iscdi=true |