Online Policy Learning-Based Output-Feedback Optimal Control of Continuous-Time Systems

Although state-feedback optimal control of continuous-time (CT) systems has been extensively studied, solving the optimal control problem online via output feedback remains challenging, especially when only input-output information can be used. In this brief, we develop an innovative technique to design the output-feedback optimal control (OFOC) of CT systems online. First, to synthesize the OFOC, an output-feedback algebraic Riccati equation (OARE) is constructed that can be solved using input-output information. Then, an online policy learning (PL) algorithm is developed to compute the solution of the OARE; it requires only input-output information and avoids the conventional offline learning procedure. Simulations based on an aircraft model are provided to test the developed control method and online learning algorithm.
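For orientation, the policy learning (PL) scheme described in the abstract belongs to the policy-iteration family for Riccati equations. The sketch below is the classical *model-based* Kleinman iteration for the standard CT algebraic Riccati equation, which alternates policy evaluation (a Lyapunov equation) with policy improvement; it is an assumed illustrative baseline, not the paper's method, since the brief's contribution is precisely to solve its output-feedback ARE online from input-output data without knowing the system matrices.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def kleinman_pi(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration (Kleinman) for the CT ARE
    A'P + PA + Q - P B R^{-1} B' P = 0, starting from a stabilizing gain K0."""
    K = K0
    for _ in range(iters):
        Ak = A - B @ K  # closed-loop matrix under the current policy u = -Kx
        # Policy evaluation: solve A_k' P + P A_k = -(Q + K' R K)
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Toy second-order system; A is already Hurwitz, so K0 = 0 is stabilizing
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = kleinman_pi(A, B, Q, R, K0=np.zeros((1, 2)))

# Residual of the ARE (near zero once the iteration has converged)
res = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T @ P)
print(np.max(np.abs(res)))
```

Each policy-evaluation step is a linear Lyapunov solve, which is what makes data-driven variants possible: the paper's online PL algorithm replaces that model-based solve with one estimated from measured input-output trajectories.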

Full Description

Saved in:
Bibliographic Details
Published in: IEEE transactions on circuits and systems. II, Express briefs, 2024-02, Vol.71 (2), p.652-656
Main authors: Zhao, Jun; Lv, Yongfeng; Zeng, Qingliang; Wan, Lirong
Format: Article
Language: English
Subjects:
Online access: Order full text
description Although state-feedback optimal control of continuous-time (CT) systems has been extensively studied, solving the optimal control problem online via output feedback remains challenging, especially when only input-output information can be used. In this brief, we develop an innovative technique to design the output-feedback optimal control (OFOC) of CT systems online. First, to synthesize the OFOC, an output-feedback algebraic Riccati equation (OARE) is constructed that can be solved using input-output information. Then, an online policy learning (PL) algorithm is developed to compute the solution of the OARE; it requires only input-output information and avoids the conventional offline learning procedure. Simulations based on an aircraft model are provided to test the developed control method and online learning algorithm.
DOI: 10.1109/TCSII.2022.3211832
ISSN: 1549-7747
EISSN: 1558-3791
Source: IEEE Electronic Library (IEL)
Subjects:
Aircraft models
Algorithms
Atmospheric modeling
Continuous time systems
Control methods
Control systems
Convergence
Heuristic algorithms
Machine learning
Mathematical models
Observers
Optimal control
Output feedback
Output-feedback control
policy learning
Riccati equation
Riccati equations
State feedback