Model-Agnostic Zeroth-Order Policy Optimization for Meta-Learning of Ergodic Linear Quadratic Regulators
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Meta-learning has been proposed as a promising machine learning topic in
recent years, with important applications to image classification, robotics,
computer games, and control systems. In this paper, we study the problem of
using meta-learning to deal with uncertainty and heterogeneity in ergodic
linear quadratic regulators. We integrate the zeroth-order optimization
technique with a typical meta-learning method, proposing an algorithm that
omits the estimation of the policy Hessian and applies to tasks of learning a
set of heterogeneous but similar linear dynamic systems. The induced
meta-objective function inherits important properties of the original cost
function when the set of linear dynamic systems is meta-learnable, allowing
the algorithm to optimize over a learnable landscape without projection onto
the feasible set. We provide a convergence result for the exact gradient
descent process by analyzing the boundedness and smoothness of the gradient of
the meta-objective, which justifies the proposed algorithm when the gradient
estimation error is small. We also provide a numerical example to
corroborate this perspective. |
---|---|
DOI: | 10.48550/arxiv.2405.17370 |