Output Feedback H∞ Control for Linear Discrete-Time Multi-Player Systems With Multi-Source Disturbances Using Off-Policy Q-Learning

In this paper, a data-driven optimal control method based on adaptive dynamic programming and game theory is presented for solving the output feedback solutions of the H∞ control problem for linear discrete-time systems with multiple players subject to multi-source disturbances. We first transform the H∞ control problem into a multi-player game problem, following the theoretical solutions given by game theory. Since the system state may not be measurable, we derive output-feedback-based control policies and disturbances through mathematical operations. Considering the advantages of off-policy reinforcement learning (RL) over on-policy RL, a novel off-policy game Q-learning algorithm that handles mixed competition and cooperation among players is developed, such that the H∞ control problem can finally be solved for linear multi-player systems without knowledge of the system dynamics. Moreover, rigorous proofs of algorithm convergence and unbiasedness of solutions are presented. Finally, simulation results demonstrate the effectiveness of the proposed method.
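The abstract describes the method at the level of ideas: the H∞ design is recast as a multi-player dynamic game (the controller minimizes a cost that worst-case disturbances maximize, subject to an attenuation bound of the form ∑‖z_k‖² ≤ γ²∑‖w_k‖²), and the game's quadratic Q-functions are learned from measured data rather than from the system matrices. As a rough illustration of the core mechanism only (the single-player, state-feedback LQR special case with no disturbances, not the authors' multi-player output-feedback algorithm), the following sketch shows how a quadratic Q-function can be fit from data by least squares and used for policy improvement; all matrices, gains, and function names here are hypothetical.

```python
# A minimal sketch of data-driven Q-learning for the single-player LQR
# special case, under stated assumptions; it is NOT the paper's
# multi-player output-feedback H∞ algorithm. A, B are used only to
# simulate data, never inside the learning update.
import numpy as np

def q_learning_lqr(A, B, Q, R, K0, n_iters=20, n_samples=400, noise=0.1):
    """Policy-iteration Q-learning: fit the quadratic Q-function
    Q(x, u) = [x; u]' H [x; u] by least squares on the Bellman equation,
    then improve the policy via K = H_uu^{-1} H_ux (so u = -K x).
    K0 is assumed stabilizing so policy evaluation is well posed."""
    n, m = B.shape
    K = K0.copy()
    for _ in range(n_iters):
        Phi, y = [], []
        for _ in range(n_samples):
            x = np.random.randn(n)
            # Behavior policy: current gain plus exploration noise.
            # The target policy evaluated below is noise-free, which is
            # the off-policy separation the abstract refers to.
            u = -K @ x + noise * np.random.randn(m)
            x_next = A @ x + B @ u          # data generation only
            u_next = -K @ x_next            # target (evaluated) policy
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Bellman equation z'Hz = x'Qx + u'Ru + z_next'H z_next is
            # linear in the entries of H, so H is found by least squares.
            Phi.append(np.outer(z, z).ravel() - np.outer(z_next, z_next).ravel())
            y.append(x @ Q @ x + u @ R @ u)
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = 0.5 * (h.reshape(n + m, n + m) + h.reshape(n + m, n + m).T)
        K = np.linalg.solve(H[n:, n:], H[n:, :n])   # policy improvement
    return K

# Hypothetical double-integrator-like example; K0 chosen stabilizing.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K_learned = q_learning_lqr(A, B, np.eye(2), np.eye(1), K0=np.array([[1.0, 1.5]]))
```

The point of the off-policy structure is that exploration noise enters only the behavior input while the Bellman equation is evaluated along the noise-free target policy, so exploration does not bias the learned solution; this is the unbiasedness property the abstract emphasizes, and extending the idea to several players and to output feedback is the paper's contribution.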

Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, p. 208938-208951
Main Authors: Xiao, Zhenfei; Li, Jinna; Li, Ping
Format: Article
Language: English
Online Access: Full text
DOI: 10.1109/ACCESS.2020.3038674
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
Subjects:
Adaptive control
adaptive dynamic programming
Algorithms
Control methods
Discrete time systems
Disturbances
Dynamic programming
Game theory
Games
H-infinity control
Heuristic algorithms
H∞ control
Machine learning
Nash equilibrium
Optimal control
Output feedback
Performance analysis
reinforcement learning
System dynamics