DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models
Saved in:
Main Authors:
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Abstract: We present two novel hyperparameter optimization strategies for deep learning models with a modular architecture composed of multiple subnetworks. As complex networks with multiple subnetworks become more frequently applied in machine learning, hyperparameter optimization methods are required to optimize their hyperparameters efficiently. Existing hyperparameter searches are general-purpose and can be used to optimize such networks; however, by exploiting the multi-subnetwork architecture, these searches can be sped up substantially. The proposed methods offer faster convergence to a better-performing final model. To demonstrate this, we propose two independent approaches to enhance these prior algorithms: 1) a divide-and-conquer approach, in which the best subnetworks of top-performing models are combined, allowing for more rapid sampling of the hyperparameter search space; and 2) a subnetwork-adaptive approach that distributes computational resources based on the importance of each subnetwork, allowing more intelligent resource allocation. These approaches can be flexibly applied to many hyperparameter optimization algorithms. To illustrate this, we combine our approaches with the commonly used Bayesian optimization (BO) method. Our approaches are then tested on both synthetic and real-world examples and applied to multiple network types, including convolutional neural networks and dense feed-forward neural networks. Our approaches show an increase in optimization efficiency of up to 23.62x, and a final performance improvement of up to 3.5% accuracy for classification and 4.4 MSE for regression, when compared to a comparable BO approach.
DOI: 10.48550/arxiv.2202.11841
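As a rough illustration of the divide-and-conquer idea described in the abstract, the minimal Python sketch below recombines the per-subnetwork hyperparameter blocks of the top-performing configurations into new candidate configurations. The configuration layout and the names `sample_config`, `evaluate`, and `divide_and_conquer_round` are hypothetical placeholders chosen for this sketch, not the paper's implementation.

```python
import itertools
import random

def sample_config(rng):
    # Hypothetical hyperparameters for a model with two subnetworks;
    # each subnetwork has its own block that can be recombined independently.
    return {
        "cnn_branch":   {"filters": rng.choice([16, 32, 64]),
                         "kernel_size": rng.choice([3, 5])},
        "dense_branch": {"units": rng.choice([64, 128, 256]),
                         "dropout": rng.choice([0.0, 0.2, 0.5])},
    }

def evaluate(config):
    # Placeholder objective: in practice, train the multi-subnetwork model
    # with `config` and return its validation score.
    raise NotImplementedError

def divide_and_conquer_round(configs, scores, top_k=3):
    """Recombine subnetwork hyperparameter blocks from the top-k configurations.

    This mirrors, at a very high level, the recombination idea in the abstract:
    take the best-performing models, split their hyperparameters by subnetwork,
    and form new candidates from every cross-combination of those blocks.
    """
    ranked = sorted(zip(scores, configs), key=lambda p: p[0], reverse=True)
    top = [cfg for _, cfg in ranked[:top_k]]
    subnet_names = list(top[0].keys())
    # Pool the per-subnetwork blocks seen among the top configurations.
    pools = {name: [cfg[name] for cfg in top] for name in subnet_names}
    # Every combination of one block per subnetwork is a new candidate.
    return [dict(zip(subnet_names, blocks))
            for blocks in itertools.product(*pools.values())]

if __name__ == "__main__":
    rng = random.Random(0)
    configs = [sample_config(rng) for _ in range(8)]
    scores = [rng.random() for _ in configs]  # stand-in for real evaluations
    for candidate in divide_and_conquer_round(configs, scores):
        print(candidate)
```

The candidates produced by such a round would then be evaluated (or fed into an outer search such as Bayesian optimization), so that recombining strong subnetwork blocks lets the search cover promising regions of the joint hyperparameter space more quickly than sampling each full configuration from scratch.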