Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints
creator | Subramanian, Barathi; Jeyaraj, Rathinaraja; Ugli, Rakhmonov Akhrorjon Akhmadjon; Kim, Jeonghong |
description | Activation functions enable neural networks to learn complex representations by introducing non-linearities. While feedforward models commonly use rectified linear units, sequential models such as recurrent neural networks, long short-term memory networks (LSTMs), and gated recurrent units (GRUs) still rely on the Sigmoid and TanH activation functions. However, when trained on small sequential datasets, these classical activation functions often struggle to model sparse patterns and to effectively capture temporal dependencies. To address this limitation, we propose the squared Sigmoid TanH (SST) activation, specifically tailored to enhance the learning capability of sequential models under data constraints. SST applies mathematical squaring to amplify the differences between strong and weak activations as signals propagate over time, facilitating improved gradient flow and information filtering. We evaluate SST-powered LSTMs and GRUs on diverse applications with limited data, such as sign language recognition, regression, and time-series classification. Our experiments demonstrate that SST models consistently outperform RNN-based models with baseline activations, exhibiting improved test accuracy. |
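The abstract describes SST as element-wise squaring of the classical Sigmoid and TanH activations, so that strong and weak activations separate more sharply as signals propagate through a recurrent cell. The record does not give the exact formulation, so the sketch below is an illustration under stated assumptions, not the authors' published definition: it assumes the gates' sigmoid outputs are simply squared, that the tanh squaring is made sign-preserving (sign(t) * t^2) so values stay in [-1, 1], and that the functions drop into an otherwise standard LSTM cell. The names sst_sigmoid, sst_tanh, and SSTLSTMCell are hypothetical.

```python
import torch


def sst_sigmoid(x: torch.Tensor) -> torch.Tensor:
    # Assumed squared-sigmoid gate: squaring pushes weak activations
    # (near 0) further toward 0 while strong ones stay close to 1.
    return torch.sigmoid(x) ** 2


def sst_tanh(x: torch.Tensor) -> torch.Tensor:
    # Assumed sign-preserving squared tanh: amplifies the gap between
    # small and large magnitudes while keeping outputs in [-1, 1].
    t = torch.tanh(x)
    return torch.sign(t) * t ** 2


class SSTLSTMCell(torch.nn.Module):
    """Minimal LSTM cell using SST-style activations (illustrative sketch)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # One linear map producing all four gate pre-activations at once.
        self.linear = torch.nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        gates = self.linear(torch.cat([x, h], dim=-1))
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = sst_sigmoid(i), sst_sigmoid(f), sst_sigmoid(o)  # gates
        g = sst_tanh(g)                                           # candidate state
        c = f * c + i * g
        h = o * sst_tanh(c)
        return h, (h, c)


# Illustrative usage: one step over a batch of 2 inputs with 8 features.
cell = SSTLSTMCell(input_size=8, hidden_size=16)
h0 = c0 = torch.zeros(2, 16)
out, (h1, c1) = cell(torch.randn(2, 8), (h0, c0))
```

Because the squared gates and cell nonlinearity remain bounded like their classical counterparts, this drop-in form preserves the usual LSTM stability properties; whether the paper's SST uses the sign-preserving variant or a plain square is an assumption here.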
doi_str_mv | 10.48550/arxiv.2402.09034 |
format | Article |
fullrecord | article record, arXiv:2402.09034 (2024-02-14); rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0; open access (free_for_read); full text: https://arxiv.org/abs/2402.09034 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2402.09034 |
language | eng |
recordid | cdi_arxiv_primary_2402_09034 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence Computer Science - Learning |
title | Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints |
url | https://arxiv.org/abs/2402.09034 |