QuantNAS for Super Resolution: Searching for Efficient Quantization-Friendly Architectures Against Quantization Noise

Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, pp. 117008-117025
Main Authors: Shvetsov, Egor; Osin, Dmitry; Zaytsev, Alexey; Koryakovskiy, Ivan; Buchnev, Valentin; Trofimov, Ilya; Burnaev, Evgeny
Format: Article
Language: English
Online Access: Full text
Description
Summary: This work aims to develop an automated procedure for discovering new, efficient solutions that can be effectively quantized in mixed-precision mode with minimal degradation. While our primary focus is on Super-Resolution (SR), the proposed procedure is applicable beyond this domain. To achieve our goals, we first develop an efficient Neural Architecture Search (NAS) procedure for full-precision models (in this paper, "full-precision" or FP refers to the 32-bit floating-point format), surpassing existing NAS solutions for SR. We then adapt this procedure for quantization-aware search. By introducing Quantization Noise (QN) during the search phase, we approximate the degradation a model suffers after quantization. Additionally, we improve search performance with entropy regularization, which prioritizes operations and their precisions within each search-space block. Our experiments confirm the superiority of quantization-aware NAS over the two-step alternative of NAS followed by quantization. Furthermore, approximating quantization with QN offers a 30% speed improvement over direct weight quantization. We validate our approach by developing and applying it to two search spaces inspired by state-of-the-art SR models. Our code is publicly available (github.com/On-Point-RND/QuantNAS).
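For orientation, a minimal sketch of the additive quantization-noise idea the abstract refers to: a uniform b-bit quantizer with step size delta introduces an error that is roughly uniform on [-delta/2, delta/2], so adding such noise to the weights mimics quantization without the non-differentiable rounding step. The function name `quant_noise`, the per-tensor min-max step size, and the PyTorch framing below are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def quant_noise(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Approximate b-bit uniform quantization of `w` with additive noise.

    Instead of rounding (non-differentiable), add noise drawn uniformly
    from [-delta/2, delta/2], where delta is the quantization step size.
    """
    # Per-tensor min-max step size; detached so noise scale is treated
    # as a constant and gradients flow only through `w` itself.
    delta = ((w.max() - w.min()) / (2 ** bits - 1)).detach()
    noise = torch.empty_like(w).uniform_(-0.5, 0.5) * delta
    return w + noise

# Usage: perturb a conv weight as one candidate precision during search.
conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
noisy_weight = quant_noise(conv.weight, bits=4)
out = F.conv2d(torch.randn(1, 3, 32, 32), noisy_weight, conv.bias, padding=1)
```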
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3446039