Online Policy Learning-Based Output-Feedback Optimal Control of Continuous-Time Systems
Although state-feedback optimal control of continuous-time (CT) systems has been extensively studied, solving the optimal control problem online via output feedback remains challenging, especially when only input-output information can be used. In this brief, we develop a new technique to design the output-feedback optimal control (OFOC) of CT systems online. First, to synthesize the OFOC, an output-feedback algebraic Riccati equation (OARE) is constructed, which can be solved using input-output information. Then, an online policy learning (PL) algorithm is developed to compute the solution of the OARE; only input-output information is required, and the conventional offline learning procedure is avoided. Simulations based on an aircraft model are provided to test the developed control method and online learning algorithm.
Saved in:
Published in: | IEEE transactions on circuits and systems. II, Express briefs, 2024-02, Vol.71 (2), p.652-656 |
---|---|
Main authors: | Zhao, Jun; Lv, Yongfeng; Zeng, Qingliang; Wan, Lirong |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 656 |
---|---|
container_issue | 2 |
container_start_page | 652 |
container_title | IEEE transactions on circuits and systems. II, Express briefs |
container_volume | 71 |
creator | Zhao, Jun; Lv, Yongfeng; Zeng, Qingliang; Wan, Lirong |
description | Although state-feedback optimal control of continuous-time (CT) systems has been extensively studied, solving the optimal control problem online via output feedback remains challenging, especially when only input-output information can be used. In this brief, we develop a new technique to design the output-feedback optimal control (OFOC) of CT systems online. First, to synthesize the OFOC, an output-feedback algebraic Riccati equation (OARE) is constructed, which can be solved using input-output information. Then, an online policy learning (PL) algorithm is developed to compute the solution of the OARE; only input-output information is required, and the conventional offline learning procedure is avoided. Simulations based on an aircraft model are provided to test the developed control method and online learning algorithm. |
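The brief's PL algorithm solves its output-feedback Riccati equation online from input-output data alone. As a rough illustration of the policy-learning loop it builds on, the classical model-based Kleinman policy iteration for the standard continuous-time ARE can be sketched as follows; the system matrices, gains, and tolerances here are illustrative assumptions, not the aircraft model or the OARE formulation from the paper.

```python
# Sketch: Kleinman policy iteration for the continuous-time ARE
#   A'P + PA - PBR^{-1}B'P + Q = 0.
# Each step evaluates the current policy via a Lyapunov equation,
# then improves the gain. Hypothetical example system, not from the brief.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, -3.0]])  # example plant (Hurwitz)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K = np.zeros((1, 2))  # initial stabilizing gain (valid since A is stable)
for _ in range(20):
    Ac = A - B @ K
    # Policy evaluation: solve Ac' P + P Ac + Q + K' R K = 0
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: K <- R^{-1} B' P
    K = np.linalg.solve(R, B.T @ P)

# The iterates converge to the unique stabilizing ARE solution
P_are = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_are, atol=1e-6))
```

The paper's contribution is to run an analogous evaluation/improvement loop without the model matrices, replacing the Lyapunov solve with least-squares identities built from measured input-output trajectories.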
doi_str_mv | 10.1109/TCSII.2022.3211832 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1549-7747 |
ispartof | IEEE transactions on circuits and systems. II, Express briefs, 2024-02, Vol.71 (2), p.652-656 |
issn | 1549-7747; 1558-3791 |
language | eng |
recordid | cdi_ieee_primary_9913499 |
source | IEEE Electronic Library (IEL) |
subjects | Aircraft models; Algorithms; Atmospheric modeling; Continuous time systems; Control methods; Control systems; Convergence; Heuristic algorithms; Machine learning; Mathematical models; Observers; Optimal control; Output feedback; Output-feedback control; policy learning; Riccati equation; Riccati equations; State feedback |
title | Online Policy Learning-Based Output-Feedback Optimal Control of Continuous-Time Systems |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T15%3A44%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20Policy%20Learning-Based%20Output-Feedback%20Optimal%20Control%20of%20Continuous-Time%20Systems&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20II,%20Express%20briefs&rft.au=Zhao,%20Jun&rft.date=2024-02-01&rft.volume=71&rft.issue=2&rft.spage=652&rft.epage=656&rft.pages=652-656&rft.issn=1549-7747&rft.eissn=1558-3791&rft.coden=ITCSFK&rft_id=info:doi/10.1109/TCSII.2022.3211832&rft_dat=%3Cproquest_RIE%3E2923124611%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2923124611&rft_id=info:pmid/&rft_ieee_id=9913499&rfr_iscdi=true |