Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints
creator | Subramanian, Barathi; Jeyaraj, Rathinaraja; Ugli, Rakhmonov Akhrorjon Akhmadjon; Kim, Jeonghong |
description | Activation functions enable neural networks to learn complex representations by introducing non-linearities. While feedforward models commonly use rectified linear units, sequential models such as recurrent neural networks, long short-term memory networks (LSTMs), and gated recurrent units (GRUs) still rely on the Sigmoid and TanH activation functions. However, when trained on small sequential datasets, these classical activation functions often struggle to model sparse patterns and to effectively capture temporal dependencies. To address this limitation, we propose the squared Sigmoid TanH (SST) activation, specifically tailored to enhance the learning capability of sequential models under data constraints. SST applies mathematical squaring to amplify the differences between strong and weak activations as signals propagate over time, facilitating improved gradient flow and information filtering. We evaluate SST-powered LSTMs and GRUs on diverse applications with limited data, such as sign language recognition, regression, and time-series classification. Our experiments demonstrate that SST models consistently outperform RNN-based models with baseline activations, exhibiting improved test accuracy. |
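The abstract describes SST as element-wise squaring of the classical Sigmoid and TanH activations, so that strong and weak activations separate more sharply as signals propagate through a recurrent cell. The record does not give the exact formulation, so the sketch below is an illustration under stated assumptions, not the authors' published definition: it assumes the gates' sigmoid outputs are simply squared, that the tanh squaring is made sign-preserving (sign(t) * t^2) so values stay in [-1, 1], and that the functions drop into an otherwise standard LSTM cell. The names sst_sigmoid, sst_tanh, and SSTLSTMCell are hypothetical.

```python
import torch


def sst_sigmoid(x: torch.Tensor) -> torch.Tensor:
    # Assumed squared-sigmoid gate: squaring pushes weak activations
    # (near 0) further toward 0 while strong ones stay close to 1.
    return torch.sigmoid(x) ** 2


def sst_tanh(x: torch.Tensor) -> torch.Tensor:
    # Assumed sign-preserving squared tanh: amplifies the gap between
    # small and large magnitudes while keeping outputs in [-1, 1].
    t = torch.tanh(x)
    return torch.sign(t) * t ** 2


class SSTLSTMCell(torch.nn.Module):
    """Minimal LSTM cell using SST-style activations (illustrative sketch)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # One linear map producing all four gate pre-activations at once.
        self.linear = torch.nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        gates = self.linear(torch.cat([x, h], dim=-1))
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = sst_sigmoid(i), sst_sigmoid(f), sst_sigmoid(o)  # gates
        g = sst_tanh(g)                                           # candidate state
        c = f * c + i * g
        h = o * sst_tanh(c)
        return h, (h, c)


# Illustrative usage: one step over a batch of 2 inputs with 8 features.
cell = SSTLSTMCell(input_size=8, hidden_size=16)
h0 = c0 = torch.zeros(2, 16)
out, (h1, c1) = cell(torch.randn(2, 8), (h0, c0))
```

Because the squared gates and cell nonlinearity remain bounded like their classical counterparts, this drop-in form preserves the usual LSTM stability properties; whether the paper's SST uses the sign-preserving variant or a plain square is an assumption here.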
doi_str_mv | 10.48550/arxiv.2402.09034 |
format | Article |
fullrecord | article record, arXiv:2402.09034 (2024-02-14); rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0; open access (free_for_read); full text: https://arxiv.org/abs/2402.09034 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2402.09034 |
language | eng |
recordid | cdi_arxiv_primary_2402_09034 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence Computer Science - Learning |
title | Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints |
url | https://arxiv.org/abs/2402.09034 |