WaveMix: A Resource-efficient Neural Network for Image Analysis
Saved in:
Main author(s): |  |
Format: | Article |
Language: | eng |
Subjects: |  |
Online access: | Order full text |
Abstract: | We propose a novel neural architecture for computer vision -- WaveMix -- that
is resource-efficient and yet generalizable and scalable. While using fewer
trainable parameters, GPU RAM, and computations, WaveMix networks achieve
comparable or better accuracy than the state-of-the-art convolutional neural
networks, vision transformers, and token mixers for several tasks. This
efficiency can translate to savings in time, cost, and energy. To achieve these
gains, we use a multi-level two-dimensional discrete wavelet transform (2D-DWT)
in WaveMix blocks, which has the following advantages: (1) it reorganizes
spatial information based on three strong image priors -- scale-invariance,
shift-invariance, and sparseness of edges; (2) it does so losslessly and
without adding parameters; (3) it reduces the spatial size of feature maps,
which cuts the memory and time required for forward and backward passes; and
(4) it expands the receptive field faster than convolutions do. The whole
architecture is a stack of self-similar and resolution-preserving WaveMix
blocks, which allows architectural flexibility for various tasks and levels of
resource availability. WaveMix establishes new benchmarks for segmentation on
Cityscapes and for classification on Galaxy 10 DECals, Places-365, five EMNIST
datasets, and iNAT-mini, and it performs competitively on other benchmarks. Our
code and trained models are publicly available. |
DOI: | 10.48550/arxiv.2205.14375 |
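
The abstract is concrete enough to sketch in code. Below is a minimal PyTorch sketch of the two ideas it describes, not the authors' released implementation: `haar_dwt2d` shows why a single-level 2D-DWT is lossless and parameter-free while halving each spatial dimension, and `WaveMixBlock` shows one plausible resolution-preserving block built around it (a channel-reducing 1x1 convolution so the four subbands concatenate back to the original width, a channel-mixing MLP, learned upsampling, and a residual connection). Every layer choice, name, and hyperparameter beyond what the abstract states is an assumption.

```python
# Hedged sketch of a WaveMix-style block; layout and hyperparameters are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn


def haar_dwt2d(x: torch.Tensor) -> torch.Tensor:
    """Single-level orthonormal Haar 2D-DWT.

    Splits (B, C, H, W) into four subbands (LL, LH, HL, HH), each of size
    (B, C, H/2, W/2), and concatenates them along the channel axis. The
    transform is invertible (lossless) and has no trainable parameters,
    and it halves each spatial dimension -- the source of the memory and
    compute savings the abstract describes.
    """
    a = x[..., 0::2, 0::2]  # top-left pixel of each 2x2 patch
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # approximation (local average)
    lh = (a - b + c - d) / 2  # horizontal detail (vertical edges)
    hl = (a + b - c - d) / 2  # vertical detail (horizontal edges)
    hh = (a - b - c + d) / 2  # diagonal detail
    return torch.cat([ll, lh, hl, hh], dim=1)  # (B, 4C, H/2, W/2)


class WaveMixBlock(nn.Module):
    """One plausible resolution-preserving token-mixing block.

    Maps (B, C, H, W) to (B, C, H, W), so blocks can be stacked
    self-similarly as the abstract describes.
    """

    def __init__(self, channels: int, mlp_ratio: int = 2):
        super().__init__()
        assert channels % 4 == 0, "four subbands must concatenate back to C"
        # Reduce to C/4 channels so the four DWT subbands give C channels.
        self.reduce = nn.Conv2d(channels, channels // 4, kernel_size=1)
        self.mlp = nn.Sequential(  # channel-mixing MLP on the subbands
            nn.Conv2d(channels, channels * mlp_ratio, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(channels * mlp_ratio, channels, kernel_size=1),
        )
        # Learned upsampling back to the input resolution.
        self.up = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        self.norm = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.reduce(x)   # (B, C/4, H, W)
        y = haar_dwt2d(y)    # (B, C, H/2, W/2): parameter-free spatial mixing
        y = self.mlp(y)      # (B, C, H/2, W/2)
        y = self.up(y)       # (B, C, H, W)
        return self.norm(y) + x  # residual connection preserves resolution


if __name__ == "__main__":
    block = WaveMixBlock(channels=64)
    x = torch.randn(2, 64, 32, 32)  # H and W must be even for the 2x2 Haar split
    assert block(x).shape == x.shape  # resolution-preserving, hence stackable
```

Because each block preserves both the channel count and the spatial resolution, depth and width can be varied independently, which is one way to read the abstract's claim of architectural flexibility across tasks and resource budgets.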