Comparative study of CNN, LSTM and hybrid CNN-LSTM model in amazigh speech recognition using spectrogram feature extraction and different gender and age dataset
The field of artificial intelligence has witnessed remarkable advancements in speech recognition technology. Among the forefront contenders in this domain are Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). However, when it comes to their efficacy in recognizing the Amazigh...
Saved in:
Published in: | International journal of speech technology 2024, Vol.27 (4), p.1121-1133 |
---|---|
Main authors: | Telmem, Meryam ; Laaidi, Naouar ; Ghanou, Youssef ; Hamiane, Sanae ; Satori, Hassan |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1133 |
---|---|
container_issue | 4 |
container_start_page | 1121 |
container_title | International journal of speech technology |
container_volume | 27 |
creator | Telmem, Meryam ; Laaidi, Naouar ; Ghanou, Youssef ; Hamiane, Sanae ; Satori, Hassan |
description | The field of artificial intelligence has witnessed remarkable advancements in speech recognition technology. Among the forefront contenders in this domain are Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). However, when it comes to their efficacy in recognizing the Amazigh language, which network reigns supreme? This article presents a comparative study of Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model in the context of speech recognition systems. The main objective of this work is to identify which network architecture delivers the best performance for recognizing the Amazigh language. Our research stands out as one of the first to develop and compare three distinct deep models specifically for the Amazigh language, effectively addressing the challenges posed by a low-resource language. Through a series of rigorous experiments and evaluations conducted using the Tifdigit dataset, the study's results underscore the superiority of CNNs in Amazigh speech recognition, achieving 88% accuracy when the CNN was trained on the female-category dataset. |
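The abstract above names spectrogram feature extraction as the front end feeding the CNN, LSTM, and hybrid models. The following is a minimal NumPy sketch of that kind of feature extraction, not the authors' actual pipeline; the frame length, hop size, and Hann window are illustrative assumptions.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a framed, Hann-windowed FFT.
    frame_len and hop are illustrative choices, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping windowed frames
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_len // 2 + 1)
    return np.log1p(mag)                        # log compression, stays >= 0

# Example: a 1-second synthetic "utterance" at 8 kHz (a 440 Hz tone)
sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)
feats = spectrogram(sig)
print(feats.shape)  # → (61, 129)
```

The resulting time-frequency matrix is what a CNN would treat as a one-channel image, while an LSTM would consume it frame by frame along the time axis.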
doi_str_mv | 10.1007/s10772-024-10154-0 |
format | Article |
publisher | New York: Springer US |
rights | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024 |
fulltext | fulltext |
identifier | ISSN: 1381-2416 |
ispartof | International journal of speech technology, 2024, Vol.27 (4), p.1121-1133 |
issn | 1381-2416 ; 1572-8110 |
language | eng |
recordid | cdi_proquest_journals_3145727024 |
source | SpringerNature Journals |
subjects | Artificial Intelligence ; Artificial neural networks ; Berber languages ; Comparative linguistics ; Comparative studies ; Datasets ; Engineering ; Mass media ; Neural networks ; Recurrent neural networks ; Short term memory ; Signal, Image and Speech Processing ; Social Sciences ; Speech recognition ; Voice recognition |
title | Comparative study of CNN, LSTM and hybrid CNN-LSTM model in amazigh speech recognition using spectrogram feature extraction and different gender and age dataset |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T03%3A28%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparative%20study%20of%20CNN,%20LSTM%20and%20hybrid%20CNN-LSTM%20model%20in%20amazigh%20speech%20recognition%20using%20spectrogram%20feature%20extraction%20and%20different%20gender%20and%20age%20dataset&rft.jtitle=International%20journal%20of%20speech%20technology&rft.au=Telmem,%20Meryam&rft.date=2024&rft.volume=27&rft.issue=4&rft.spage=1121&rft.epage=1133&rft.pages=1121-1133&rft.issn=1381-2416&rft.eissn=1572-8110&rft_id=info:doi/10.1007/s10772-024-10154-0&rft_dat=%3Cproquest_cross%3E3145727024%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3145727024&rft_id=info:pmid/&rfr_iscdi=true |