Reinforcement Learning based Output-Feedback Control of Nonlinear Nonstrict Feedback Discrete-time Systems with Application to Engines
A novel reinforcement-learning-based output-feedback adaptive neural network (NN) controller, also referred to as the adaptive-critic NN controller, is developed to track a desired trajectory for a class of complex nonlinear discrete-time systems in the presence of bounded, unknown disturbances. The controller comprises an observer for estimating the states and outputs, a critic, and two action NNs that generate the virtual and actual control inputs. The critic approximates a strategic utility function, and the action NNs are tuned to minimize both this utility function and their own outputs. All NN weights adapt online to minimize a performance index using a gradient-descent-based rule. A Lyapunov analysis proves uniform ultimate boundedness (UUB) of the closed-loop tracking, weight-estimation, and observer-estimation errors. The separation and certainty-equivalence principles are relaxed, and neither a persistency-of-excitation condition nor a linear-in-the-unknown-parameters assumption is needed. The performance of the adaptive-critic NN controller is evaluated in simulation on the Daw engine model in lean operation, where the objective is to reduce the cyclic dispersion in heat release.
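To make the architecture in the abstract concrete, the following is a minimal, hypothetical sketch of an adaptive-critic pair with gradient-descent weight tuning. It is not the paper's algorithm: the network sizes, fixed random input weights, quadratic stage cost, TD(0)-style critic target, simplified action update, and the toy plant are all assumptions made for illustration, and the observer and the second (virtual-control) action NN of the paper are omitted.

```python
# A minimal adaptive-critic sketch in the spirit of the abstract.
# All structural choices below are assumptions for readability; the paper
# derives its own Lyapunov-based update laws, and no stability claim is
# made for this sketch.
import numpy as np

rng = np.random.default_rng(0)

def hidden(W, x):
    # One tanh hidden layer; only the output weights are tuned, a common
    # linearly-parameterized-NN simplification.
    return np.tanh(W @ x)

class AdaptiveCritic:
    def __init__(self, n_state, n_hidden=8, lr_critic=0.02, lr_action=0.02):
        self.Vc = rng.standard_normal((n_hidden, n_state))  # critic features
        self.Va = rng.standard_normal((n_hidden, n_state))  # action features
        self.wc = np.zeros(n_hidden)  # critic output weights
        self.wa = np.zeros(n_hidden)  # action output weights
        self.lc, self.la = lr_critic, lr_action

    def control(self, x):
        # Scalar control input produced by the action NN.
        return float(self.wa @ hidden(self.Va, x))

    def update(self, x, x_next, tracking_err, gamma=0.9):
        # Critic: semi-gradient TD(0) step toward a discounted cost-to-go,
        # standing in for the paper's "strategic utility function".
        phi_c, phi_c_next = hidden(self.Vc, x), hidden(self.Vc, x_next)
        cost = float(tracking_err @ tracking_err)  # quadratic stage cost
        td = cost + gamma * float(self.wc @ phi_c_next) - float(self.wc @ phi_c)
        self.wc += self.lc * td * phi_c
        # Action: heuristic gradient step pushing the action weights against
        # the critic's cost estimate plus the control effort, mirroring the
        # abstract's "minimize both the utility function and their outputs".
        phi_a = hidden(self.Va, x)
        u = float(self.wa @ phi_a)
        self.wa -= self.la * (float(self.wc @ phi_c) + u) * phi_a

# Toy usage on a hypothetical scalar plant (not the Daw engine model):
ac = AdaptiveCritic(n_state=1)
x = np.array([0.5])
for k in range(200):
    u = ac.control(x)
    x_next = np.array([0.8 * np.sin(x[0]) + u])  # illustrative dynamics
    ac.update(x, x_next, tracking_err=x_next)    # regulate toward the origin
    x = x_next
```

The one-step loop mirrors the online, per-sample adaptation the abstract describes; the gains here are illustrative only.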
container_end_page | 5111 |
container_issue | |
container_start_page | 5106 |
container_title | 2007 American Control Conference |
container_volume | |
creator | Shih, P.; Vance, J.; Kaul, B.; Jagannathan, S.; Drallmeier, J.A. |
description | A novel reinforcement-learning-based output-feedback adaptive neural network (NN) controller, also referred to as the adaptive-critic NN controller, is developed to track a desired trajectory for a class of complex nonlinear discrete-time systems in the presence of bounded, unknown disturbances. The controller comprises an observer for estimating the states and outputs, a critic, and two action NNs that generate the virtual and actual control inputs. The critic approximates a strategic utility function, and the action NNs are tuned to minimize both this utility function and their own outputs. All NN weights adapt online to minimize a performance index using a gradient-descent-based rule. A Lyapunov analysis proves uniform ultimate boundedness (UUB) of the closed-loop tracking, weight-estimation, and observer-estimation errors. The separation and certainty-equivalence principles are relaxed, and neither a persistency-of-excitation condition nor a linear-in-the-unknown-parameters assumption is needed. The performance of the adaptive-critic NN controller is evaluated in simulation on the Daw engine model in lean operation, where the objective is to reduce the cyclic dispersion in heat release. |
doi_str_mv | 10.1109/ACC.2007.4283127 |
format | Conference Proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0743-1619 |
ispartof | 2007 American Control Conference, 2007, p.5106-5111 |
issn | 0743-1619 2378-5861 |
language | eng |
recordid | cdi_ieee_primary_4283127 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Control systems; Engines; Learning; Neural networks; Neurofeedback; Nonlinear control systems; Observers; Output feedback; State estimation; Trajectory |
title | Reinforcement Learning based Output-Feedback Control of Nonlinear Nonstrict Feedback Discrete-time Systems with Application to Engines |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T08%3A13%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Reinforcement%20Learning%20based%20Output-Feedback%20Control%20of%20Nonlinear%20Nonstrict%20Feedback%20Discrete-time%20Systems%20with%20Application%20to%20Engines&rft.btitle=2007%20American%20Control%20Conference&rft.au=Shih,%20P.&rft.date=2007-07&rft.spage=5106&rft.epage=5111&rft.pages=5106-5111&rft.issn=0743-1619&rft.eissn=2378-5861&rft.isbn=9781424409884&rft.isbn_list=1424409888&rft_id=info:doi/10.1109/ACC.2007.4283127&rft_dat=%3Cieee_6IE%3E4283127%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424409896&rft.eisbn_list=9781424409891&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4283127&rfr_iscdi=true |