Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems

This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to find iteratively near-optimal policies of the adaptive optimal stationary control problem directly from input/state data without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
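The data-driven algorithm in the article is built on the classical model-based policy iteration for continuous-time LQR (Kleinman's algorithm). As a minimal sketch only, assuming known system matrices `A`, `B` and ignoring the additive/multiplicative noise terms that the paper handles, the iteration alternates a Lyapunov-equation policy-evaluation step with a policy-improvement step:

```python
# Model-based policy iteration for continuous-time LQR: the iteration that
# the paper's data-driven algorithm emulates from input/state data.
# Illustrative sketch only -- noise terms and the data-driven estimation
# of the Lyapunov solution are omitted.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Kleinman's policy iteration; K0 must be stabilizing (admissible)."""
    K = K0
    for _ in range(iters):
        Ak = A - B @ K
        # Policy evaluation: solve Ak' P + P Ak + Q + K' R K = 0
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
    return K, P

# Scalar example: dx = (x + u) dt, Q = R = 1.
# The Riccati equation 2P - P^2 + 1 = 0 gives P = 1 + sqrt(2) = K.
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.eye(1); R = np.eye(1)
K, P = policy_iteration(A, B, Q, R, K0=np.array([[2.0]]))
```

The paper's contribution is to carry out this evaluation/improvement loop using least squares on input/state data instead of the system matrices, and to prove convergence to a neighborhood of the optimum with probability one despite the noise.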

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE transactions on automatic control 2023-04, Vol.68 (4), p.2383-2390
Main authors: Pang, Bo, Jiang, Zhong-Ping
Format: Article
Language: English
Subjects:
Online access: Order full text
container_end_page 2390
container_issue 4
container_start_page 2383
container_title IEEE transactions on automatic control
container_volume 68
creator Pang, Bo
Jiang, Zhong-Ping
description This article studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to find iteratively near-optimal policies of the adaptive optimal stationary control problem directly from input/state data without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
doi_str_mv 10.1109/TAC.2022.3172250
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9286
ispartof IEEE transactions on automatic control, 2023-04, Vol.68 (4), p.2383-2390
issn 0018-9286
1558-2523
language eng
recordid cdi_proquest_journals_2792135156
source IEEE Electronic Library (IEL)
subjects Adaptive control
Adaptive optimal control
Algorithms
data-driven control
Heuristic algorithms
Least squares
Machine learning
Optimal control
Performance analysis
policy iteration
Process control
Reinforcement learning
robustness
stochastic control
Stochastic processes
Stochastic systems
title Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T16%3A41%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reinforcement%20Learning%20for%20Adaptive%20Optimal%20Stationary%20Control%20of%20Linear%20Stochastic%20Systems&rft.jtitle=IEEE%20transactions%20on%20automatic%20control&rft.au=Pang,%20Bo&rft.date=2023-04-01&rft.volume=68&rft.issue=4&rft.spage=2383&rft.epage=2390&rft.pages=2383-2390&rft.issn=0018-9286&rft.eissn=1558-2523&rft.coden=IETAA9&rft_id=info:doi/10.1109/TAC.2022.3172250&rft_dat=%3Cproquest_RIE%3E2792135156%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2792135156&rft_id=info:pmid/&rft_ieee_id=9767669&rfr_iscdi=true