Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation

The softmax function is widely used in artificial neural networks for multiclass classification problems: the softmax transformation enforces the outputs to be positive and sum to one, and the corresponding loss function allows the model to be optimized with the maximum likelihood principle. However, softmax leaves the loss function a large margin to optimize over when it comes to high-dimensional classification, which degrades performance to some extent. In this paper, we provide an empirical study of a simple and concise softmax variant, namely sparse-softmax, to alleviate the problems that traditional softmax encounters in high-dimensional classification. We evaluate our approach on several interdisciplinary tasks; the experimental results show that sparse-softmax is simpler, faster, and produces better results than the baseline models.
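This record carries only the abstract, so the paper's exact definition of sparse-softmax is not reproduced here. The sketch below is a minimal Python/NumPy illustration of the general idea the abstract describes: `softmax` is the standard transformation, and `topk_sparse_softmax` is one plausible sparsified variant (the function names, the top-k rule, and the default k=3 are assumptions for illustration, not the paper's formulation) that normalizes over only the largest logits instead of the whole high-dimensional output.

```python
import numpy as np

def softmax(logits):
    """Standard softmax: positive outputs that sum to one."""
    z = logits - np.max(logits)      # shift by the max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def topk_sparse_softmax(logits, k=3):
    """Hypothetical top-k sparsification (illustrative, not the paper's exact
    method): renormalize only over the k largest logits and assign every other
    class exactly zero probability."""
    logits = np.asarray(logits, dtype=float)
    topk_idx = np.argpartition(logits, -k)[-k:]    # indices of the k largest logits
    probs = np.zeros_like(logits)
    z = logits[topk_idx] - logits[topk_idx].max()  # stabilize before exponentiating
    exp_z = np.exp(z)
    probs[topk_idx] = exp_z / exp_z.sum()
    return probs

# A high-dimensional logit vector: dense softmax spreads probability mass over
# every class, while the top-k variant concentrates it on a handful of classes.
logits = np.random.randn(10_000)
dense = softmax(logits)
sparse = topk_sparse_softmax(logits, k=3)
print(dense.sum(), (dense > 0).sum())    # ~1.0, 10000 nonzero entries
print(sparse.sum(), (sparse > 0).sum())  # ~1.0, 3 nonzero entries
```

With thousands of classes, a sparsified transformation of this kind keeps the cross-entropy loss from having to push down every near-zero tail probability, which is roughly the inefficiency the abstract attributes to the traditional softmax.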

Bibliographic Details
Main authors: Sun, Shaoshi; Zhang, Zhenyuan; Huang, BoCheng; Lei, Pengbin; Su, Jianlin; Pan, Shengfeng; Cao, Jiarun
Format: Article
Language: eng
Subjects: Computer Science - Computation and Language; Computer Science - Learning
Online access: Full text at https://arxiv.org/abs/2112.12433
DOI: 10.48550/arxiv.2112.12433
Published: 2021-12-23 (arXiv)
Source: arXiv.org