Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices
Main authors: | , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Low-latency, low-power edge AI is essential for Virtual Reality and
Augmented Reality applications. Recent advances show that hybrid models,
combining convolution layers (CNN) and transformers (ViT), often achieve a
superior accuracy/performance tradeoff on various computer vision and machine
learning (ML) tasks. However, hybrid ML models can pose system challenges for
latency and energy efficiency because of their diverse dataflow and
memory access patterns. In this work, we leverage the architectural
heterogeneity of Neural Processing Units (NPU) and Compute-In-Memory (CIM)
and apply diverse execution schemas to run these hybrid
models efficiently. We also introduce H4H-NAS, a Neural Architecture Search
framework for designing efficient hybrid CNN/ViT models for heterogeneous edge
systems with both NPU and CIM. H4H-NAS is powered by a performance estimator
built from NPU performance results measured on real silicon and CIM performance
based on industry IPs. H4H-NAS searches hybrid CNN/ViT models at fine
granularity and achieves a significant top-1 accuracy improvement (up to 1.34%)
on the ImageNet dataset. Moreover, results from our algorithm/hardware
co-design reveal up to 56.08% overall latency and 41.72% energy improvements
from introducing such heterogeneous computing over baseline solutions. The
framework guides the design of hybrid network architectures and system
architectures of NPU+CIM heterogeneous systems. |
---|---|
DOI: | 10.48550/arxiv.2410.08326 |