Effective Bilevel Optimization via Minimax Reformulation
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract:

Bilevel optimization has found successful applications in various machine learning problems, including hyper-parameter optimization, data cleaning, and meta-learning. However, its substantial computational cost presents a significant challenge for its use in large-scale problems. This challenge arises from the nested structure of the bilevel formulation, where each hyper-gradient computation requires a costly inner optimization procedure. To address this issue, we propose a reformulation of bilevel optimization as a minimax problem, effectively decoupling the outer-inner dependency. Under mild conditions, we show that the two problems are equivalent. Furthermore, we introduce a multi-stage gradient descent and ascent (GDA) algorithm to solve the resulting minimax problem with convergence guarantees. Extensive experimental results demonstrate that our method outperforms state-of-the-art bilevel methods while significantly reducing the computational cost.
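The abstract only names the minimax reformulation; as a rough sketch of how a bilevel problem can be recast in minimax form, a standard value-function penalty construction looks as follows (the symbols $f$, $g$, $x$, $y$, $z$ and the multiplier $\lambda$ are generic notation introduced here, not taken from the paper, whose exact reformulation may differ):

```latex
% Bilevel problem: outer objective f, inner (lower-level) objective g.
\min_{x}\; f\bigl(x,\, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) \in \operatorname*{arg\,min}_{y}\; g(x, y)

% A value-function penalty recasting as a single minimax problem:
% the inner max over z recovers -\min_{z} g(x, z), so the bracketed
% term penalizes suboptimality of y in the lower-level problem.
\min_{x,\, y}\; \max_{z}\;
f(x, y) + \lambda \bigl( g(x, y) - g(x, z) \bigr),
\qquad \lambda > 0
```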
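Likewise, here is a minimal sketch of gradient descent and ascent on a hand-written toy saddle-point objective, with illustrative step sizes; taking several ascent updates per descent update loosely echoes a multi-stage schedule, but this is not the paper's multi-stage GDA algorithm:

```python
# Minimal GDA sketch on a toy saddle-point problem; illustrative only.
# Objective: f(x, y) = x**2 + 2*x*y - y**2
#   - strongly convex in x, strongly concave in y
#   - unique saddle point at (x, y) = (0, 0)

def grad_x(x, y):          # df/dx
    return 2 * x + 2 * y

def grad_y(x, y):          # df/dy
    return 2 * x - 2 * y

def gda(x=3.0, y=-2.0, eta=0.05, steps=2000, ascent_steps=5):
    for _ in range(steps):
        for _ in range(ascent_steps):  # several ascent updates on the
            y += eta * grad_y(x, y)    # max player per outer iteration
        x -= eta * grad_x(x, y)        # one descent update on the min player
    return x, y

print(gda())  # converges to approximately (0.0, 0.0)
```

Taking a few ascent steps before each descent step keeps the max player near its best response, which stabilizes the descent direction on this toy example.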
DOI: 10.48550/arxiv.2305.13153