Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: The softmax function is widely used in artificial neural networks for
multiclass classification problems, where the softmax transformation constrains
the outputs to be positive and sum to one, and the corresponding loss function
allows the maximum likelihood principle to be used to optimize the model.
However, in high-dimensional classification, softmax leaves the loss function a
large margin over which to optimize, which degrades performance to some extent.
In this paper, we provide an empirical study on a simple and concise softmax
variant, namely sparse-softmax, to alleviate the problem that occurs with the
traditional softmax in high-dimensional classification problems. We evaluate
our approach on several interdisciplinary tasks; the experimental results show
that sparse-softmax is simpler, faster, and produces better results than the
baseline models.
DOI: 10.48550/arxiv.2112.12433
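
Since the record itself contains no code, the following is a minimal illustrative sketch of one common way to realize a sparse softmax: a top-k variant that normalizes only over the k largest logits and zeroes out the rest, so the probability mass (and the loss gradient) is concentrated on a few classes instead of spread over a high-dimensional output. The function name `sparse_softmax` and the choice of `k` are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Standard softmax: positive outputs that sum to one."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

def sparse_softmax(logits, k=3):
    """Illustrative top-k sparse softmax (hypothetical sketch, not the
    paper's exact formulation): normalize only over the k largest logits
    and set all other probabilities to zero."""
    probs = np.zeros_like(logits, dtype=float)
    top_k = np.argpartition(logits, -k)[-k:]      # indices of the k largest logits
    shifted = logits[top_k] - np.max(logits[top_k])
    exps = np.exp(shifted)
    probs[top_k] = exps / np.sum(exps)            # sparse distribution still sums to one
    return probs

logits = np.array([2.0, 1.0, 0.1, -1.5, 0.5])
print(softmax(logits))         # dense distribution over all 5 classes
print(sparse_softmax(logits))  # mass concentrated on the top 3 classes, rest exactly zero
```

In this sketch the output remains a valid probability distribution, but classes outside the top k receive exactly zero probability, which is the intuition behind restricting the margin the loss function must optimize over in high-dimensional settings.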