PEA: Improving the Performance of ReLU Networks for Free by Using Progressive Ensemble Activations
Format: Article
Language: English
Abstract: In recent years novel activation functions have been proposed to improve the
performance of neural networks, and they show superior performance compared to
their ReLU counterparts. However, there are environments where the availability
of complex activations is limited and usually only the ReLU is supported. In
this paper we propose methods that can be used to improve the performance of
ReLU networks by using these efficient novel activations during model training.
More specifically, we propose ensemble activations that are composed of the
ReLU and one of these novel activations. Furthermore, the coefficients of the
ensemble are neither fixed nor learned, but are progressively updated during
the training process in a way that by the end of the training only the ReLU
activations remain active in the network and the other activations can be
removed. This means that at inference time the network contains ReLU
activations only. We perform extensive evaluations on the ImageNet
classification task using various compact network architectures and various
novel activation functions. Results show a 0.2-0.8% top-1 accuracy gain, which
confirms the applicability of the proposed methods. Furthermore, we demonstrate
the proposed methods on semantic segmentation and we boost the performance of a
compact segmentation network by 0.34% mIOU on the Cityscapes dataset.
DOI: 10.48550/arxiv.2207.14074
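
The abstract only states that the ensemble coefficients are progressively pushed toward pure ReLU during training; it does not specify the schedule or the partner activation. The following is a minimal PyTorch sketch under assumed choices: SiLU as the "novel" activation, a linear ramp of the ReLU weight from 0.5 to 1.0 over training, and hypothetical names (ProgressiveEnsembleActivation, set_progress). It is meant only to illustrate the idea that the secondary branch can be dropped at inference, not to reproduce the paper's method.

```python
# Sketch of a progressive ensemble activation (assumptions: SiLU partner,
# linear coefficient schedule; neither is specified in the abstract).
import torch
import torch.nn as nn


class ProgressiveEnsembleActivation(nn.Module):
    """Blend of ReLU and a secondary activation whose mixing coefficient is
    driven by an external training-progress signal rather than learned."""

    def __init__(self, secondary: nn.Module = None):
        super().__init__()
        self.relu = nn.ReLU()
        # SiLU stands in for "one of these novel activations"; any smooth
        # activation could be plugged in here.
        self.secondary = secondary if secondary is not None else nn.SiLU()
        # alpha = weight of the ReLU branch; starts at 0.5 and is pushed to
        # 1.0 by the end of training so the secondary branch can be removed.
        self.register_buffer("alpha", torch.tensor(0.5))

    def set_progress(self, progress: float) -> None:
        """progress in [0, 1]: fraction of training completed.
        Linear ramp from the initial alpha to 1.0 (an assumed schedule)."""
        p = min(max(progress, 0.0), 1.0)
        self.alpha.fill_(0.5 + 0.5 * p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if float(self.alpha) >= 1.0:
            # The ensemble has collapsed to pure ReLU, matching inference-time use.
            return self.relu(x)
        return self.alpha * self.relu(x) + (1.0 - self.alpha) * self.secondary(x)


if __name__ == "__main__":
    # Usage: update the coefficient once per epoch (or step) from the training loop.
    act = ProgressiveEnsembleActivation()
    x = torch.randn(4, 8)
    for epoch in range(10):
        act.set_progress(epoch / 9)  # 0.0 -> 1.0 over training
        _ = act(x)
    # After training, act(x) equals ReLU(x), so the SiLU branch can be dropped.
```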