Learning-Based 6-DOF Control for Autonomous Proximity Operations Under Motion Constraints

This article proposes a reinforcement learning (RL)-based six-degree-of-freedom (6-DOF) control scheme for the final-phase proximity operations of spacecraft. The main novelty of the proposed method lies in two aspects: 1) the closed-loop performance can be improved in real time through the RL technique, achieving online approximate optimal control subject to the full 6-DOF nonlinear dynamics of the spacecraft; and 2) nontrivial motion constraints of proximity operations are considered and strictly obeyed throughout the control process. As a stepping stone, the dual-quaternion formalism is employed to characterize the 6-DOF dynamics model and the motion constraints in a single algebraic framework.
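
The record quotes no equations, but the dual-quaternion formalism named above can be sketched in a few lines. The following is a minimal illustration in Python; the [w, x, y, z] storage order, the (1/2)·t⊗q convention for the dual part, and all function names are assumptions for illustration, not the paper's own definitions.

```python
# Minimal dual-quaternion utilities (illustrative sketch, not the paper's code).
# A unit dual quaternion q_hat = q + eps * (1/2) t⊗q packs attitude q and
# translation t into one object, which is what lets the 6-DOF dynamics and
# motion constraints be written in a single algebraic framework.
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def qconj(q):
    """Quaternion conjugate."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def dq_from_pose(q, t):
    """Unit dual quaternion (real part, dual part) from attitude q and position t."""
    t_quat = np.concatenate(([0.0], t))   # position as a pure quaternion
    return q, 0.5 * qmul(t_quat, q)

def dq_mul(a, b):
    """(a_r + eps a_d)(b_r + eps b_d) = a_r b_r + eps (a_r b_d + a_d b_r)."""
    return qmul(a[0], b[0]), qmul(a[0], b[1]) + qmul(a[1], b[0])

def dq_conj(d):
    """Quaternion conjugate applied to both parts."""
    return qconj(d[0]), qconj(d[1])

def pose_error(desired, actual):
    """Tracking error q_e = conj(q_d) ⊗ q, coupling attitude and position."""
    return dq_mul(dq_conj(desired), actual)
```

For example, pose_error(dq_from_pose(q_d, t_d), dq_from_pose(q, t)) equals the identity dual quaternion exactly when both the attitude and the position match their targets, which is what allows a single error signal to drive all six degrees of freedom.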

Building on that formalism, an RL-based control scheme is developed under the dual-quaternion algebra to approximate the optimal control solution associated with a cost function and its Hamilton–Jacobi–Bellman equation. In addition, a specially designed barrier function is embedded in the reward function so that motion-constraint violations are avoided throughout learning.
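
The record gives no detail of the barrier construction itself, so the snippet below is only a plausible sketch of the idea the abstract describes: fold the motion constraint into the stage cost through a term that diverges at the constraint boundary. The approach-cone constraint, the logarithmic form, and every name and gain (cone_margin, log_barrier, stage_cost, k, Q, R) are illustrative assumptions, not the paper's definitions.

```python
# Illustrative stage cost with an embedded barrier (assumed form).
# The barrier is small well inside the constraint set and grows without
# bound as the boundary is approached, so a cost-minimizing policy is
# pushed to keep the constraint satisfied at all times.
import numpy as np

def cone_margin(p, axis, half_angle):
    """h(p) > 0 while the relative position p stays inside an approach cone
    about the unit vector `axis`; h(p) -> 0 at the cone boundary."""
    return p @ axis / (np.linalg.norm(p) + 1e-12) - np.cos(half_angle)

def log_barrier(h, k=1.0, eps=1e-9):
    """-k*log(h): mild inside the safe set, unbounded as h -> 0+."""
    return -k * np.log(np.maximum(h, eps))

def stage_cost(x, u, p, Q, R, axis, half_angle):
    """Quadratic state/control penalty plus the constraint barrier, the kind
    of reward shaping the abstract describes."""
    return x @ Q @ x + u @ R @ u + log_barrier(cone_margin(p, axis, half_angle))
```

Any trajectory that grazes the constraint boundary accrues unbounded cost, so a policy trained to (approximately) minimize this cost is steered away from violations, matching the abstract's claim that the constraints are strictly obeyed during the whole control process.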

A Lyapunov-based stability analysis guarantees the ultimate boundedness of the state errors and of the neural network (NN) weight estimation errors. We further show that a PD-like controller under the dual-quaternion formulation can serve as the initial control policy that triggers the online learning process; its boundedness is proved by a special Lyapunov strictification method. Simulation results of prototypical spacecraft proximity-operation missions illustrate the effectiveness of the proposed method.
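
The PD-like initializer also admits a compact sketch, reusing the dual-quaternion helpers from the first snippet; the gain values and the way the 6-D error vector is extracted are assumptions, not the paper's certified law.

```python
# Illustrative PD-like initial policy on the dual-quaternion error (assumed
# form, not the paper's tuned controller). vec() stacks the vector parts of
# the real (attitude) and dual (translation) components into a 6-D error;
# vel_err stacks the angular and linear velocity errors.
import numpy as np

def vec(dq_err):
    """6-D 'vector part' of a dual-quaternion error: attitude, then translation."""
    real, dual = dq_err
    return np.concatenate([real[1:], dual[1:]])

def pd_like_policy(dq_err, vel_err, kp=0.4, kd=1.5):
    """u = -kp*vec(q_e) - kd*vel_err: a wrench whose first three components
    act as torque and whose last three act as force."""
    return -kp * vec(dq_err) - kd * np.asarray(vel_err)
```

The point of such an initializer is that a simple, provably bounded feedback can seed the online RL loop, so learning starts from a stabilizing policy rather than from scratch.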

Bibliographic Details

Published in: IEEE Transactions on Aerospace and Electronic Systems, 2021-12, Vol. 57 (6), pp. 4097-4109
Authors: Hu, Qinglei; Yang, Haoyang; Dong, Hongyang; Zhao, Xiaowei
Format: Article
Language: English
Publisher: IEEE, New York
DOI: 10.1109/TAES.2021.3094628
ISSN: 0018-9251
EISSN: 1557-9603

Subjects:
Aerodynamics
Approximate optimal control
constrained 6-DOF control
Constraint modelling
Cost function
Degrees of freedom
Distance learning
Errors
Motion stability
Nonlinear dynamics
Optimal control
Proximity
Quaternions
Real-time systems
Reinforcement learning
reinforcement learning (RL)
Space vehicles
Spacecraft
spacecraft proximity operations
Stability analysis
Task analysis