Online inverse reinforcement learning for nonlinear systems with adversarial attacks
In the inverse reinforcement learning (RL) problem, there are two agents. A learner agent seeks to mimic another expert agent's state and control input behavior trajectories by observing the expert's behavior trajectories. These observations are used to reconstruct the unknown expert's performance objective.
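For orientation, one generic continuous-time formulation of this setting is sketched below; the control-affine dynamics, the attack channel, and the quadratic cost structure are illustrative assumptions in the spirit of the abstract, not the article's exact model.

```latex
% Learner/expert dynamics with control input u and adversarial input d
% (assumed control-affine form)
\dot{x} = f(x) + g(x)\,u + k(x)\,d
% Unknown expert performance objective to be reconstructed from observed
% (x, u) trajectories: state penalty q(x), control weight R, and an
% attenuation level \gamma on the adversarial input
J(u, d) = \int_{0}^{\infty} \left( q(x) + u^{\top} R\, u - \gamma^{2} \lVert d \rVert^{2} \right) \mathrm{d}t
```

In this reading, "reconstructing the performance objective" amounts to recovering the state penalty q(·), which the article approximates with a dedicated state penalty NN, from the observed expert behavior.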
Saved in:
Published in: | International journal of robust and nonlinear control 2021-09, Vol.31 (14), p.6646-6667 |
---|---|
Main authors: | Lian, Bosen; Xue, Wenqian; Lewis, Frank L.; Chai, Tianyou |
Format: | Article |
Language: | eng |
Subjects: | adaptive control; Algorithms; Dynamical systems; integral reinforcement learning; inverse optimal control; inverse reinforcement learning; Machine learning; Neural networks; Nonlinear dynamics; Nonlinear systems; Optimal control; System dynamics; Trajectory control |
Online access: | Full text |
container_end_page | 6667 |
---|---|
container_issue | 14 |
container_start_page | 6646 |
container_title | International journal of robust and nonlinear control |
container_volume | 31 |
creator | Lian, Bosen; Xue, Wenqian; Lewis, Frank L.; Chai, Tianyou |
description | In the inverse reinforcement learning (RL) problem, there are two agents. A learner agent seeks to mimic another expert agent's state and control input behavior trajectories by observing the expert's behavior trajectories. These observations are used to reconstruct the unknown expert's performance objective. This article develops novel inverse RL algorithms to solve the inverse RL problem in which both agents suffer from adversarial attacks and have continuous-time nonlinear dynamics. We first propose an offline inverse RL algorithm for the learner to reconstruct the unknown expert's performance objective. This offline inverse RL algorithm is based on the technique of integral RL (IRL) and only needs partial knowledge of the system dynamics. The algorithm has two learning stages: first an optimal control learning stage, and then a second learning stage based on inverse optimal control. Then, based on the offline algorithm, an online inverse RL algorithm is further developed to solve the inverse RL problem in real time without knowing the system drift dynamics. This online adaptive learning method consists of simultaneous adaptation of four neural networks (NNs): a critic NN, an actor NN, an adversary NN, and a state penalty NN. Convergence of the algorithms as well as the stability of the learner system and the synchronous tuning NNs are guaranteed. Simulation examples verify the effectiveness of the online method. (An illustrative sketch of this four-network adaptation follows the record fields below.) |
doi_str_mv | 10.1002/rnc.5626 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 1049-8923 |
ispartof | International journal of robust and nonlinear control, 2021-09, Vol.31 (14), p.6646-6667 |
issn | 1049-8923; 1099-1239 |
language | eng |
recordid | cdi_proquest_journals_2559629301 |
source | Wiley Online Library All Journals |
subjects | adaptive control; Algorithms; Dynamical systems; integral reinforcement learning; inverse optimal control; inverse reinforcement learning; Machine learning; Neural networks; Nonlinear dynamics; Nonlinear systems; Optimal control; System dynamics; Trajectory control |
title | Online inverse reinforcement learning for nonlinear systems with adversarial attacks |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T18%3A57%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20inverse%20reinforcement%20learning%20for%20nonlinear%20systems%20with%20adversarial%20attacks&rft.jtitle=International%20journal%20of%20robust%20and%20nonlinear%20control&rft.au=Lian,%20Bosen&rft.date=2021-09-25&rft.volume=31&rft.issue=14&rft.spage=6646&rft.epage=6667&rft.pages=6646-6667&rft.issn=1049-8923&rft.eissn=1099-1239&rft_id=info:doi/10.1002/rnc.5626&rft_dat=%3Cproquest_cross%3E2559629301%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2559629301&rft_id=info:pmid/&rfr_iscdi=true |
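The description field above notes that the online method simultaneously adapts four NN approximators: a critic, an actor, an adversary, and a state penalty network. As a rough, self-contained illustration of what "simultaneous adaptation of four approximators" can look like, here is a toy Python loop. The quadratic basis, the stable example dynamics, the gains, and the update rules are all assumptions made for demonstration; they are not the article's tuning laws or its convergence-guaranteed design, and the sketch does not use expert demonstrations as the article's inverse-RL stage does.

```python
# Toy illustration: four weight vectors (critic, actor, adversary, state penalty)
# tuned in one loop. All structural choices below are assumed for demonstration.
import numpy as np

def basis(x):
    """Quadratic basis for a 2-state system (assumed approximator structure)."""
    x1, x2 = x
    return np.array([x1 * x1, x1 * x2, x2 * x2])

def grad_basis(x):
    """Jacobian of the basis with respect to the state."""
    x1, x2 = x
    return np.array([[2.0 * x1, 0.0],
                     [x2, x1],
                     [0.0, 2.0 * x2]])

# One weight vector per network, all tuned simultaneously.
w_critic = np.zeros(3)     # value-function approximator
w_actor = np.zeros(3)      # control-policy approximator
w_adversary = np.zeros(3)  # attack-policy approximator
w_penalty = np.ones(3)     # reconstructed state penalty q(x) ~ w_penalty @ basis(x)

alpha, dt = 0.05, 0.01     # learning rate and integration step (assumed)
R, gamma2 = 1.0, 4.0       # control weight and attenuation level (assumed)

x = np.array([1.0, -0.5])
for _ in range(2000):
    phi, dphi = basis(x), grad_basis(x)

    # Policies implied by the actor/adversary weights (assumed linear-in-basis
    # forms, both acting through the second state channel).
    u = -0.5 / R * float(w_actor @ dphi[:, 1])
    d = 0.5 / gamma2 * float(w_adversary @ dphi[:, 1])

    # Assumed stable example dynamics so the sketch stays bounded.
    x_dot = np.array([x[1], -x[0] - 0.5 * x[1] + u + d])
    x_next = x + dt * x_dot

    # Bellman-like residual built from the *reconstructed* penalty.
    running_cost = float(w_penalty @ phi) + R * u * u - gamma2 * d * d
    e = float(w_critic @ (basis(x_next) - phi)) + dt * running_cost

    # Simultaneous updates of all four weight vectors.
    w_critic -= alpha * e * (basis(x_next) - phi)    # gradient step on the residual
    w_actor -= alpha * (w_actor - w_critic)          # actor tracks the critic
    w_adversary -= alpha * (w_adversary - w_critic)  # adversary tracks the critic
    w_penalty -= alpha * e * dt * phi                # penalty adjusted via the same residual

    x = x_next

print("critic weights:", w_critic)
print("reconstructed penalty weights:", w_penalty)
```

In the article itself, the state penalty network would be driven by the mismatch between the learner's behavior and the observed expert trajectories; the point of this sketch is only the structure of four approximators sharing one adaptation loop.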