Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation
The softmax function is widely used in artificial neural networks for multiclass classification problems: the softmax transformation constrains the outputs to be positive and to sum to one, and the corresponding loss function allows the model to be optimized under the maximum-likelihood principle. However, in high-dimensional classification, softmax leaves the loss function a large margin over which to optimize, which degrades performance to some extent. In this paper, we provide an empirical study of a simple and concise softmax variant, sparse-softmax, to alleviate the problem that the traditional softmax exhibits on high-dimensional classification problems. We evaluate our approach on several interdisciplinary tasks; the experimental results show that sparse-softmax is simpler, faster, and produces better results than the baseline models.
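As a point of reference, here is a minimal NumPy sketch of the standard softmax transformation described in the abstract, alongside a hypothetical top-k sparse variant. The abstract does not specify how sparse-softmax is constructed, so the `sparse_softmax` function and its `k` parameter below are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np

def softmax(logits):
    """Standard softmax: returns positive values that sum to one."""
    z = logits - np.max(logits)       # shift by the max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def sparse_softmax(logits, k=2):
    """Hypothetical top-k sparse variant (an assumption, not the paper's exact
    definition): normalize only the k largest logits and give every other
    class exactly zero probability."""
    probs = np.zeros_like(logits, dtype=float)
    topk = np.argsort(logits)[-k:]    # indices of the k largest logits
    z = logits[topk] - np.max(logits[topk])
    exp_z = np.exp(z)
    probs[topk] = exp_z / exp_z.sum()
    return probs

logits = np.array([2.0, 1.0, 0.1, -1.2, 0.5])
print(softmax(logits))                # dense: every class gets some probability
print(sparse_softmax(logits, k=2))    # sparse: mass only on the top-2 classes
```

The comparison is only meant to make the "positive and sum to one" constraint concrete: both outputs are valid probability distributions, but the sparse variant concentrates all of the mass on a few classes instead of spreading a long tail of near-zero probabilities over every class.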
Saved in:
Main Authors: | Sun, Shaoshi; Zhang, Zhenyuan; Huang, BoCheng; Lei, Pengbin; Su, Jianlin; Pan, Shengfeng; Cao, Jiarun |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computation and Language; Computer Science - Learning |
Online Access: | Order full text |
creator | Sun, Shaoshi; Zhang, Zhenyuan; Huang, BoCheng; Lei, Pengbin; Su, Jianlin; Pan, Shengfeng; Cao, Jiarun |
---|---|
description | The softmax function is widely used in artificial neural networks for
multiclass classification problems: the softmax transformation constrains the
outputs to be positive and to sum to one, and the corresponding loss function
allows the model to be optimized under the maximum-likelihood principle.
However, in high-dimensional classification, softmax leaves the loss function a
large margin over which to optimize, which degrades performance to some extent.
In this paper, we provide an empirical study of a simple and concise softmax
variant, sparse-softmax, to alleviate the problem that the traditional softmax
exhibits on high-dimensional classification problems. We evaluate our approach
on several interdisciplinary tasks; the experimental results show that
sparse-softmax is simpler, faster, and produces better results than the
baseline models. |
doi_str_mv | 10.48550/arxiv.2112.12433 |
format | Article |
identifier | DOI: 10.48550/arxiv.2112.12433 |
language | eng |
recordid | cdi_arxiv_primary_2112_12433 |
source | arXiv.org |
subjects | Computer Science - Computation and Language; Computer Science - Learning |
title | Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation |
url | https://arxiv.org/abs/2112.12433 |