Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints

Activation functions enable neural networks to learn complex representations by introducing non-linearities. While feedforward models commonly use rectified linear units, sequential models like recurrent neural networks, long short-term memory networks (LSTMs), and gated recurrent units (GRUs) still rely on Sigmoid and TanH activation functions. However, these classical activation functions often struggle to model sparse patterns and capture temporal dependencies when trained on small sequential datasets. To address this limitation, we propose the squared Sigmoid TanH (SST) activation, specifically tailored to enhance the learning capability of sequential models under data constraints. SST applies mathematical squaring to amplify differences between strong and weak activations as signals propagate over time, facilitating improved gradient flow and information filtering. We evaluate SST-powered LSTMs and GRUs for diverse applications, such as sign language recognition, regression, and time-series classification tasks, where the dataset is limited. Our experiments demonstrate that SST models consistently outperform RNN-based models with baseline activations, exhibiting improved test accuracy.
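
The abstract describes SST as applying mathematical squaring to the classical Sigmoid and TanH activations so that strong activations are preserved while weak ones are pushed further toward zero. The short sketch below illustrates that idea numerically; the exact elementwise formulation used here (squaring each activation's output and preserving the sign of TanH) is an illustrative assumption, not the authors' published definition.

import numpy as np

# Illustrative sketch only: the precise SST formulation is assumed, not taken
# from the paper.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squared_sigmoid(x):
    # Squaring widens the gap between strong gates (~1.0 -> 1.0) and weak
    # gates (~0.5 -> 0.25), sharpening the filtering behaviour over time.
    return sigmoid(x) ** 2

def squared_tanh(x):
    # Square the magnitude but keep the sign, so a candidate state would
    # still carry direction information (a hypothetical choice for this sketch).
    t = np.tanh(x)
    return np.sign(t) * t ** 2

x = np.linspace(-3.0, 3.0, 7)
print(squared_sigmoid(x))  # small inputs shrink much faster than large ones
print(squared_tanh(x))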

Bibliographic Details

Main authors: Subramanian, Barathi; Jeyaraj, Rathinaraja; Ugli, Rakhmonov Akhrorjon Akhmadjon; Kim, Jeonghong
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning
Source: arXiv.org
DOI: 10.48550/arxiv.2402.09034
Online access: https://arxiv.org/abs/2402.09034 (full text)
Published: 2024-02-14