Comparative study of CNN, LSTM and hybrid CNN-LSTM model in Amazigh speech recognition using spectrogram feature extraction and different gender and age dataset


Bibliographic details
Published in: International journal of speech technology, 2024, Vol. 27 (4), p. 1121-1133
Main authors: Telmem, Meryam; Laaidi, Naouar; Ghanou, Youssef; Hamiane, Sanae; Satori, Hassan
Format: Article
Language: English
Online access: Full text
Description: The field of artificial intelligence has witnessed remarkable advancements in speech recognition technology. Among the forefront contenders in this domain are Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). However, when it comes to their efficacy in recognizing the Amazigh language, which network reigns supreme? This article presents a comparative study of Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model in the context of speech recognition systems. The main objective of this work is to identify which network architecture delivers the best performance for recognizing the Amazigh language. Our research stands out as one of the first to develop and compare three distinct deep models specifically for the Amazigh language, effectively addressing the challenges posed by a low-resource language. Through a series of rigorous experiments and evaluations conducted using the Tifdigit dataset, the study’s results underscore the superiority of CNNs in Amazigh speech recognition, reaching 88% accuracy when the CNN was trained on the female-category dataset.
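The title names spectrogram feature extraction as the front end feeding the networks. A minimal sketch of such a front end, via a short-time Fourier transform in NumPy, is shown below; the 8 kHz sample rate, 256-sample frame length, and 128-sample hop are illustrative assumptions, not the paper's reported settings.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Slices the signal into overlapping frames, applies a Hann window,
    and takes the FFT magnitude of each frame. Returns an array of
    shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: a one-second 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(sig)
print(spec.shape)  # (61, 129): 61 frames, 129 frequency bins
```

The resulting time-frequency matrix is what a CNN treats as an image and an LSTM treats as a sequence of per-frame feature vectors, which is why the same spectrogram front end can feed all three compared architectures.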
DOI: 10.1007/s10772-024-10154-0
Publisher: Springer US (New York)
ISSN: 1381-2416
eISSN: 1572-8110
Source: SpringerNature Journals
Subjects: Artificial Intelligence
Artificial neural networks
Berber languages
Comparative linguistics
Comparative studies
Datasets
Engineering
Mass media
Neural networks
Recurrent neural networks
Short term memory
Signal, Image and Speech Processing
Social Sciences
Speech recognition
Voice recognition