Building High-Throughput Neural Architecture Search Workflows via a Decoupled Fitness Prediction Engine
Published in: IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 11, pp. 2913–2926, Jan. 2022
Format: Article
Language: English
Online Access: Full text
Abstract: Neural networks (NNs) are used in high-performance computing and high-throughput analysis to extract knowledge from datasets. Neural architecture search (NAS) automates NN design by generating, training, and analyzing thousands of NNs. However, NAS requires massive computational power for NN training. To address the challenges of efficiency and scalability, we propose PENGUIN, a decoupled fitness prediction engine that informs the search without interfering in it. PENGUIN uses parametric modeling to predict the fitness of NNs. Existing NAS methods and parametric modeling functions can be plugged into PENGUIN to build flexible NAS workflows. Through this decoupling and flexible parametric modeling, PENGUIN reduces training costs: by predicting the fitness of NNs, it enables NAS to terminate training early. Early termination increases the number of NNs that fixed compute resources can evaluate, giving NAS additional opportunities to find better NNs. We assess the effectiveness of our engine on 6,000 NNs across three diverse benchmark datasets and three state-of-the-art NAS implementations on the Summit supercomputer. Augmenting these NAS implementations with PENGUIN can increase throughput by a factor of 1.6 to 7.1. Furthermore, wall-time tests indicate that PENGUIN can reduce training time by a factor of 2.5 to 5.3.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2022.3140681
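
The abstract describes predicting an NN's final fitness from a partial learning curve and terminating unpromising training runs early. The sketch below illustrates that general idea under stated assumptions: a saturating power-law curve fitted with scipy.optimize.curve_fit and a fixed accuracy cutoff. PENGUIN's actual parametric modeling functions, termination policy, and interfaces are defined in the paper; every function name and parameter here is hypothetical.

```python
# Minimal sketch of learning-curve-based early termination. The power-law
# model, the threshold policy, and all names are illustrative assumptions,
# not PENGUIN's actual API.
import numpy as np
from scipy.optimize import curve_fit

def power_law(epoch, a, b, c):
    # Validation accuracy rises toward the asymptote `a` as training proceeds.
    return a - b * np.power(epoch, -c)

def predict_final_fitness(val_acc, total_epochs):
    """Fit the parametric curve to observed validation accuracies and
    extrapolate to the final training epoch."""
    epochs = np.arange(1, len(val_acc) + 1, dtype=float)
    params, _ = curve_fit(power_law, epochs, val_acc,
                          p0=(0.9, 0.5, 1.0),
                          bounds=([0.0, 0.0, 0.01], [1.0, 1.0, 10.0]))
    return power_law(total_epochs, *params)

def should_stop_early(val_acc, total_epochs, threshold, min_epochs=5):
    # Wait until enough points exist for a stable fit; afterwards, terminate
    # any NN whose predicted final fitness falls below the cutoff, freeing
    # compute to evaluate additional candidate architectures.
    if len(val_acc) < min_epochs:
        return False
    return predict_final_fitness(val_acc, total_epochs) < threshold

# Example: a candidate NN plateauing near 0.55 after 6 of 50 epochs.
history = [0.31, 0.42, 0.48, 0.51, 0.53, 0.54]
print(should_stop_early(history, total_epochs=50, threshold=0.80))  # True
```

Because a predictor like this only reads the training history and returns a stop-or-continue decision, it can sit beside any NAS loop without modifying the search logic itself, which is the decoupling the abstract emphasizes.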