Comparative study of CNN, LSTM and hybrid CNN-LSTM model in Amazigh speech recognition using spectrogram feature extraction and different gender and age dataset


Bibliographic details
Published in: International journal of speech technology, 2024, Vol. 27 (4), p. 1121-1133
Main authors: Telmem, Meryam; Laaidi, Naouar; Ghanou, Youssef; Hamiane, Sanae; Satori, Hassan
Format: Article
Language: English
Online access: Full text
Description: The field of artificial intelligence has witnessed remarkable advancements in speech recognition technology. Among the forefront contenders in this domain are Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). However, when it comes to their efficacy in recognizing the Amazigh language, which network reigns supreme? This article presents a comparative study of Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model in the context of speech recognition systems. The main objective of this work is to identify which network architecture delivers the best performance for recognizing the Amazigh language. Our research stands out as one of the first to develop and compare three distinct deep models specifically for the Amazigh language, effectively addressing the challenges posed by a low-resource language. Through a series of rigorous experiments and evaluations conducted using the Tifdigit dataset, the study’s results underscore the superiority of CNNs in Amazigh speech recognition, reaching 88% accuracy when the CNN was trained on the female-category dataset.
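The title names spectrogram feature extraction as the front end feeding the networks. A minimal sketch of such a front end, via a short-time Fourier transform in NumPy, is shown below; the 8 kHz sample rate, 256-sample frame length, and 128-sample hop are illustrative assumptions, not the paper's reported settings.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Slices the signal into overlapping frames, applies a Hann window,
    and takes the FFT magnitude of each frame. Returns an array of
    shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: a one-second 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(sig)
print(spec.shape)  # (61, 129): 61 frames, 129 frequency bins
```

The resulting time-frequency matrix is what a CNN treats as an image and an LSTM treats as a sequence of per-frame feature vectors, which is why the same spectrogram front end can feed all three compared architectures.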
DOI: 10.1007/s10772-024-10154-0
Publisher: Springer US (New York)
ISSN: 1381-2416
eISSN: 1572-8110
Source: SpringerNature Journals
Subjects: Artificial Intelligence
Artificial neural networks
Berber languages
Comparative linguistics
Comparative studies
Datasets
Engineering
Mass media
Neural networks
Recurrent neural networks
Short term memory
Signal, Image and Speech Processing
Social Sciences
Speech recognition
Voice recognition