An Ode to an ODE
Main authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where
time-dependent parameters of the main flow evolve according to a matrix flow on
the orthogonal group O(d). This nested system of two flows, where the
parameter-flow is constrained to lie on the compact manifold, provides
stability and effectiveness of training and provably solves the gradient
vanishing-explosion problem which is intrinsically related to training deep
neural network architectures such as Neural ODEs. Consequently, it leads to
better downstream models, as we show on the example of training reinforcement
learning policies with evolution strategies, and in the supervised learning
setting, by comparing with previous SOTA baselines. We provide strong
convergence results for our proposed mechanism that are independent of the
depth of the network, supporting our empirical studies. Our results show an
intriguing connection between the theory of deep neural networks and the field
of matrix flows on compact manifolds. |
---|---|
DOI: | 10.48550/arxiv.2006.11421 |
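
The nested system described in the summary can be illustrated with a small numerical sketch. It is not the authors' implementation: the step rule, the `tanh` nonlinearity, the choice of rotation plane, and every function name below are assumptions made for illustration only. The parameter matrix W is advanced by right-multiplying it with the exponential of a skew-symmetric generator (here a plane rotation), so W remains in O(d) at every step, while a state x follows a simple Euler step driven by the current W.

```python
import math


def identity(d):
    """Return the d x d identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def matvec(a, v):
    """Multiply a matrix by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def plane_rotation(d, i, j, theta):
    """exp(theta * (E_ji - E_ij)): rotation by theta in the (i, j) plane.

    The generator is skew-symmetric, so the result is orthogonal.
    """
    g = identity(d)
    c, s = math.cos(theta), math.sin(theta)
    g[i][i], g[j][j] = c, c
    g[i][j], g[j][i] = -s, s
    return g


def parameter_flow_step(w, k, h):
    """One step of the (assumed) parameter flow: W <- W * exp(h * Omega_k).

    Omega_k is a hypothetical skew-symmetric generator that rotates one
    coordinate plane with a time-varying angular velocity.
    """
    d = len(w)
    i, j = k % d, (k + 1) % d
    theta = h * math.sin(0.7 * k + 1.0)  # illustrative choice only
    return matmul(w, plane_rotation(d, i, j, theta))


def main_flow_step(x, w, h):
    """One Euler step of the (assumed) main flow dx/dt = tanh(W x)."""
    wx = matvec(w, x)
    return [x_i + h * math.tanh(z_i) for x_i, z_i in zip(x, wx)]


def orthogonality_error(w):
    """Largest absolute entry of W^T W - I."""
    d = len(w)
    wt = [list(col) for col in zip(*w)]
    gram = matmul(wt, w)
    return max(abs(gram[i][j] - (1.0 if i == j else 0.0))
               for i in range(d) for j in range(d))


def run_flow(d=4, steps=200, h=0.05):
    """Integrate both flows together and return the final state and W."""
    w = identity(d)
    x = [1.0 / (i + 1) for i in range(d)]
    for k in range(steps):
        x = main_flow_step(x, w, h)       # state uses the current parameters
        w = parameter_flow_step(w, k, h)  # parameters move on O(d)
    return x, w
```

Calling `run_flow()` and passing the returned matrix to `orthogonality_error` shows that W stays orthogonal up to floating-point rounding regardless of the number of steps. Because an orthogonal matrix preserves vector norms, the linear part of each step neither shrinks nor amplifies a signal passing through it, which is the intuition behind the stability claim in the summary.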