Ternary Policy Iteration Algorithm for Nonlinear Robust Control
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The uncertainties in plant dynamics remain a challenge for nonlinear control
problems. This paper develops a ternary policy iteration (TPI) algorithm for
solving nonlinear robust control problems with bounded uncertainties. The
controller and uncertainty of the system are considered as game players, and
the robust control problem is formulated as a two-player zero-sum differential
game. In order to solve the differential game, the corresponding
Hamilton-Jacobi-Isaacs (HJI) equation is then derived. Three loss functions and
three update phases are designed to match the identity equation, minimization
and maximization of the HJI equation, respectively. These loss functions are
defined as the expectation of the approximate Hamiltonian over a generated state
set, which avoids having to process every state in the entire state set at once.
The parameters of the value function and the policies are updated directly by
minimizing the designed loss functions with gradient descent.
Moreover, zero-initialization can be applied to the parameters of the control
policy. The effectiveness of the proposed TPI algorithm is demonstrated through
two simulation studies. The simulation results show that the TPI algorithm can
converge to the optimal solution for the linear plant, and has high resistance
to disturbances for the nonlinear plant. |
---|---|
DOI: | 10.48550/arxiv.2007.06810 |
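
The three update phases described in the summary can be illustrated with a minimal sketch on a scalar linear plant. Everything concrete here is an assumption made for illustration and is not taken from the paper: the plant dx/dt = a*x + b*u + k*w, the running cost q*x^2 + r*u^2 - gamma^2*w^2, the parameterizations V(x) = p*x^2, u = th_u*x, w = th_w*x, the choice of the squared Hamiltonian residual as the loss for the identity phase, and all numeric values. For simplicity the value parameter is zero-initialized as well as the policy parameters.

```python
import math
import random

# Hypothetical scalar plant dx/dt = a*x + b*u + k*w with running cost
# q*x^2 + r*u^2 - gamma^2*w^2. All numbers are illustrative assumptions.
a, b, k = -1.0, 1.0, 0.5
q, r, gamma = 1.0, 1.0, 1.0


def hamiltonian_coeff(p, th_u, th_w):
    """Return h such that H(x) = h*x^2 for V = p*x^2, u = th_u*x, w = th_w*x."""
    closed_loop = a + b * th_u + k * th_w
    return q + r * th_u**2 - gamma**2 * th_w**2 + 2.0 * p * closed_loop


def train(steps=5000, lr=0.05, batch=64, seed=0):
    rng = random.Random(seed)
    p, th_u, th_w = 0.0, 0.0, 0.0  # zero-initialized parameters
    for _ in range(steps):
        # Generated state set: a fresh batch of sampled states.
        xs = [rng.uniform(-1.0, 1.0) for _ in range(batch)]
        m2 = sum(x**2 for x in xs) / batch  # sample mean of x^2
        m4 = sum(x**4 for x in xs) / batch  # sample mean of x^4

        # Phase 1 (identity equation): gradient descent on E[H^2] w.r.t. p.
        h = hamiltonian_coeff(p, th_u, th_w)
        closed_loop = a + b * th_u + k * th_w
        p -= lr * 4.0 * h * m4 * closed_loop

        # Phase 2 (minimization): gradient descent on E[H] w.r.t. th_u.
        th_u -= lr * m2 * (2.0 * r * th_u + 2.0 * p * b)

        # Phase 3 (maximization): gradient ascent on E[H] w.r.t. th_w.
        th_w += lr * m2 * (-2.0 * gamma**2 * th_w + 2.0 * p * k)
    return p, th_u, th_w


if __name__ == "__main__":
    p, th_u, th_w = train()
    # Closed-form solution of the scalar game Riccati equation, for comparison.
    s = b**2 / r - k**2 / gamma**2
    p_star = (a + math.sqrt(a * a + s * q)) / s
    print(p, p_star, th_u, th_w)
```

In this linear-quadratic setting the learned p can be compared against the positive root of the scalar game Riccati equation, and the policies against u = -(b/r)*p*x and w = (k/gamma^2)*p*x. A nonlinear plant would need function approximators and automatic differentiation in place of the hand-derived gradients used here.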