Online solution of nonquadratic two-player zero-sum games arising in the H∞ control of constrained input systems

SUMMARY: In this paper, we present an online learning algorithm to find the solution to the H∞ control problem of continuous-time systems with input constraints. A suitable nonquadratic functional is utilized to encode the input constraints into the H∞ control problem, and the related H∞ control...

Detailed description

Bibliographic details
Published in: International journal of adaptive control and signal processing 2014-03, Vol.28 (3-5), p.232-254
Main authors: Modares, Hamidreza, Lewis, Frank L., Sistani, Mohammad-Bagher Naghibi
Format: Article
Language: eng
Subjects: H∞ control; input constraints; neural networks; policy iteration; two-player zero-sum games
Online access: Full text
container_end_page 254
container_issue 3-5
container_start_page 232
container_title International journal of adaptive control and signal processing
container_volume 28
creator Modares, Hamidreza
Lewis, Frank L.
Sistani, Mohammad-Bagher Naghibi
description SUMMARY: In this paper, we present an online learning algorithm to find the solution to the H∞ control problem of continuous-time systems with input constraints. A suitable nonquadratic functional is utilized to encode the input constraints into the H∞ control problem, and the related H∞ control problem is formulated as a two-player zero-sum game with a nonquadratic performance. Then, a policy iteration algorithm on an actor–critic–disturbance structure is developed to solve the Hamilton–Jacobi–Isaacs (HJI) equation associated with this nonquadratic zero-sum game. That is, three NN approximators, namely, actor, critic, and disturbance, are tuned online and simultaneously for approximating the HJI solution. The value of the actor and disturbance policies is approximated continuously by the critic NN, and then on the basis of this value estimate, the actor and disturbance NNs are updated in real time to improve their policies. The disturbance tries to make the worst possible disturbance, whereas the actor tries to make the best control input. A persistence of excitation condition is shown to guarantee convergence to the optimal saddle point solution. Stability of the closed-loop system is also guaranteed. A simulation on a nonlinear benchmark problem is performed to validate the effectiveness of the proposed approach. Copyright © 2012 John Wiley & Sons, Ltd.
doi_str_mv 10.1002/acs.2348
format Article
addtitle Int. J. Adapt. Control Signal Process
publisher Bognor Regis: Blackwell Publishing Ltd
rights Copyright © 2012 John Wiley & Sons, Ltd.
Copyright © 2014 John Wiley & Sons, Ltd.
tpages 23
fulltext fulltext
identifier ISSN: 0890-6327
ispartof International journal of adaptive control and signal processing, 2014-03, Vol.28 (3-5), p.232-254
issn 0890-6327
1099-1115
language eng
recordid cdi_proquest_journals_1508331087
source Wiley Online Library Journals Frontfile Complete
subjects H∞ control
input constraints
neural networks
policy iteration
two-player zero-sum games
title Online solution of nonquadratic two-player zero-sum games arising in the H∞ control of constrained input systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T06%3A02%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20solution%20of%20nonquadratic%20two-player%20zero-sum%20games%20arising%20in%20the%20H%E2%80%89%E2%88%9E%E2%80%89%20control%20of%20constrained%20input%20systems&rft.jtitle=International%20journal%20of%20adaptive%20control%20and%20signal%20processing&rft.au=Modares,%20Hamidreza&rft.date=2014-03&rft.volume=28&rft.issue=3-5&rft.spage=232&rft.epage=254&rft.pages=232-254&rft.issn=0890-6327&rft.eissn=1099-1115&rft_id=info:doi/10.1002/acs.2348&rft_dat=%3Cproquest_cross%3E3249454941%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1508331087&rft_id=info:pmid/&rfr_iscdi=true